0. Introduction and Summary. This paper extends and unifies some
previous formulations and theories of estimation for one-parameter
problems. The basic criterion used is admissibility of a point
estimator, defined with reference to its full distribution rather
than special loss functions such as squared error. Theoretical
methods of characterizing admissible estimators are given, and
practical computational methods for their use are illustrated in
a variety of examples.

Point, confidence limit, and confidence interval estimation are
included in a single theoretical formulation, and incorporated into
estimators of an "omnibus" form called "confidence curves." The
usefulness of the latter for some applications as well as theoretical
purposes is illustrated.

Fisher's maximum likelihood principle of estimation is generalized,
given exact (non-asymptotic) justification, and unified with
the theory of tests and confidence regions of Neyman and Pearson.
Relations between exact and asymptotic results are discussed.

An application of the general theory gives optimal sequential
estimators having prescribed precision in a specified interval.

Further developments, including multiparameter and nuisance parameter
problems, problems of choice among admissible estimators,
formal and informal criteria for optimality, and related problems
in the foundations of statistical inference, will be presented
subsequently.




1. A broad formulation of the problem of point estimation. We consider
problems of estimation with reference to a specified experiment $E$,
leaving aside here questions of experimental design including
those of choice of a sample size or a sequential sampling rule;
some definite sampling rule, possibly sequential, is assumed specified
as part of $E$. Let $S = \{x\}$ denote the sample space of possible
outcomes $x$ of the experiment. Let $f(x, \theta)$ denote one of the
elementary probability functions on $S$ which are specified as possibly
true. Let $\Omega = \{\theta\}$ denote the specified parameter space.
For each $\theta$ in $\Omega$ and for each subset $A$ of $S$, the
probability that $E$ yields an outcome $x$ in $A$ is given by

$$\mathrm{Prob}\{X \in A \mid \theta\} = \int_A f(x, \theta)\, d\mu(x),$$

where $\mu$ is a specified $\sigma$-finite measure on $S$. (We assume
tacitly here and below that consideration is appropriately restricted
to measurable sets and functions only.)

If $\gamma = \gamma(\theta)$ is any function defined on $\Omega$ (e.g.
$\gamma(\theta) = \theta$ or some other function of $\theta$), with
range $\Gamma$, a point estimator of $\gamma$ is any measurable
function $g = g(x)$ taking values in $\Gamma$ (or in $\bar{\Gamma}$,
its closure, if, for example, $\Gamma$ is an open interval). The
problem of choosing a good estimator, that is, an estimator which tends
to take values close to the true unknown value of $\gamma$, has been
formulated mathematically in various ways. Most formulations achieve
mathematical definiteness by introducing criteria of closeness which
appear somewhat arbitrary from some standpoints of application and
undesirably schematic as expressions of the intuitive notion of
closeness.

If $\Omega$ is given no specific (parametric) structure, then the
latter features can be fully avoided only by a very broad formulation
which specifies only that if $\gamma$ is true, then an exactly correct
estimate ($g = \gamma$) is closer than any incorrect estimate
($g \ne \gamma$). If $\Omega$ is finite, $\Omega = \{\theta_1, \ldots,
\theta_k\}$, and $\gamma(\theta) = \theta$, this leads to the
formulation of Lindley [1] in which estimators are compared only
on the basis of their error probabilities

$$p_{ij} = \mathrm{Prob}[\theta^*(X) = \theta_j \mid \theta_i], \quad i, j = 1, \ldots, k, \quad i \ne j,$$

where $\theta^*(x)$ is any estimator of $\theta$. This formulation has
no very useful extension to typical estimation problems in which, for
example, $\Omega$ is an interval, and in which the event
$\theta^*(X) = \theta$ exactly has typically negligible probability
and little interest.

The case in which $\Omega$ is any set of real numbers, for example an
interval, and $\gamma(\theta) = \theta$, may be termed the central
problem of the theory of point estimation, although very important
generalizations of this problem have been treated extensively. For
this problem, closeness of $\theta^*$ to $\theta$ has been specified
by the introduction of specific loss functions: The absolute error
criterion, $|\theta^* - \theta|$, was introduced by Laplace. Gauss
replaced this by the squared error criterion $(\theta^* - \theta)^2$,
which proved mathematically much more tractable and provided a
definite formulation of the problem which seemed equally reasonable.
A generalized squared error criterion,
$c(\theta) \cdot (\theta^* - \theta)^2$, where $c(\theta)$ is any
specified positive function, is used in some work in modern
statistical decision theory. Such criteria are sometimes used in
conjunction with the requirement of unbiasedness,
$E(\theta^*(X) \mid \theta) \equiv \theta$; this is done (evidently
primarily to facilitate mathematical developments) particularly in the
theory of linear estimation due to Gauss; this reduces the mean
squared error criterion to a criterion of variance:
$E[(\theta^* - \theta)^2 \mid \theta] \equiv \mathrm{Var}(\theta^* \mid \theta)$.
(For a brief account of the history of the theory of point estimation,
cf. Neyman [2], pp. 9-14.)
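The reduction of mean squared error to variance under unbiasedness
follows from the standard decomposition, sketched here for
convenience. The bias symbol $b(\theta)$ is introduced only for this
illustration and is not notation from the text.

```latex
\begin{align*}
b(\theta) &= E[\theta^*(X) \mid \theta] - \theta, \\
E[(\theta^* - \theta)^2 \mid \theta]
  &= \mathrm{Var}(\theta^* \mid \theta) + b(\theta)^2 .
\end{align*}
```

When $\theta^*$ is unbiased, $b(\theta) \equiv 0$ and only the variance
term remains.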

Each such definite specification of closeness can be criticized
as somewhat arbitrary, except in a context where one postulates
the reality of the indicated costs of errors of each possible kind.
To avoid such features it is evidently necessary and sufficient to
adopt the following weak specification of closeness: If
$\theta_1^* < \theta_2^* \le \theta$ or if
$\theta \le \theta_2^* < \theta_1^*$, the estimate $\theta_2^*$ is
called closer than $\theta_1^*$ to $\theta$; if
$\theta_1^* < \theta < \theta_2^*$ no comparison as to closeness is to
be made. (The latter point was put forth by Galileo in an exchange
which retains interest in connection with questions of formulation of
estimation problems, particularly distinctions between errors of
inference and economic valuations, and the historical origins of
unbiasedness criteria. Cf. [3].)
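The weak specification of closeness can be written as a small
comparison function. This is only an illustrative sketch: the name
`closer_estimate` and the convention of returning `None` when no
comparison is made are assumptions of the example, not part of the
formulation.

```python
from typing import Optional


def closer_estimate(t1: float, t2: float, theta: float) -> Optional[float]:
    """Return whichever of two estimates is closer to theta under the
    weak specification of closeness, or None if no comparison is made."""
    if t1 == t2:
        return None  # identical estimates: nothing to compare
    lo, hi = min(t1, t2), max(t1, t2)
    if hi <= theta:
        # Both estimates lie on the low side (hi may be exactly correct),
        # so the larger one is closer.
        return hi
    if lo >= theta:
        # Both estimates lie on the high side, so the smaller one is closer.
        return lo
    # The estimates lie on opposite sides of theta: not compared.
    return None
```

With `theta = 1.0`, the pair `(0.5, 0.9)` yields `0.9`, while the pair
`(0.9, 1.05)` yields `None` even though 1.05 has the smaller absolute
error; that refusal to compare across the true value is what separates
this ordering from the loss-function criteria.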

This specification of closeness leads to comparisons between
estimators on the basis of all of their probabilities of errors of
over-estimation and under-estimation by various amounts
$d = |\theta^* - \theta|$:

$$a(u, \theta, \theta^*) = \begin{cases}
F(u, \theta, \theta^*) \equiv \mathrm{Prob}\{\theta^*(X) \le u \mid \theta\} & \text{for } u < \theta, \\
1 - F(u - 0, \theta, \theta^*) \equiv \mathrm{Prob}\{\theta^*(X) \ge u \mid \theta\} & \text{for } u > \theta.
\end{cases}$$

That is, estimators are compared only on the basis of their complete
cumulative distribution functions (c.d.f.'s) $F(u, \theta, \theta^*)$
for each $\theta \in \Omega$, rather than on the basis of certain
"summaries" (functionals) of these c.d.f.'s such as mean squared
error. The function $a(u, \theta, \theta^*)$, defined for any
estimator $\theta^*(x)$ at each $\theta \in \Omega$ and each
$u \ne \theta$, will be called the risk curve of $\theta^*$ at
$\theta$ (or, more precisely, of $\theta^*(\cdot)$ at $\theta$).
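A risk curve can be evaluated directly when the estimator has a
discrete distribution, where the distinction between $F(u)$ and
$F(u - 0)$ matters. The sketch below assumes, purely for illustration,
a binomial experiment with $n$ trials and the estimator
$\theta^*(x) = x/n$; neither is taken from the text.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def risk_curve(u: float, theta: float, n: int) -> float:
    """a(u, theta, theta*) for the assumed estimator theta*(x) = x / n,
    with X ~ Binomial(n, theta). Defined only for u != theta."""
    if u == theta:
        raise ValueError("the risk curve is defined only for u != theta")
    if u < theta:
        # Under-estimation: Prob{theta*(X) <= u | theta}
        return sum(binom_pmf(k, n, theta) for k in range(n + 1) if k / n <= u)
    # Over-estimation: Prob{theta*(X) >= u | theta}
    return sum(binom_pmf(k, n, theta) for k in range(n + 1) if k / n >= u)
```

For `n = 4` and `theta = 0.5` the curve is a step function: it equals
0.3125 at `u = 0.25` and at `u = 0.75`, and drops to 0.0625 at
`u = 0.8`, where only the outcome `x = 4` counts as an error.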
The family of distributions under consideration may be viewed
as having a parametric structure only in the sense that it is ordered
by the labeling of each function $f(x, \theta)$ of $x$ by a different
real number $\theta$. From this standpoint, the problem of estimating
$\theta$ is equivalent to that of estimating $\gamma = \gamma(\theta)$
if the latter is any specified strictly monotone function. The
formulation adopted above is clearly unaffected by (invariant under)
such transformations of the parameter space
($\Omega \to \gamma(\Omega) = \Gamma$), as contrasted with some other
formulations referred to above.
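The invariance can be checked numerically. The following sketch
assumes a normal model with unit variance, the estimator
$\theta^*(x) = x$, and the increasing transformation
$\gamma(\theta) = e^{\theta}$; all three are choices made for the
illustration only. Empirical risk curves are computed from simulated
draws, first on the original scale and then on the transformed scale.

```python
import math
import random


def empirical_risk(estimates, u, target):
    """Empirical analogue of a(u, target, .): the fraction of estimates
    at or beyond u, on the same side of the target as u."""
    n = len(estimates)
    if u < target:
        return sum(e <= u for e in estimates) / n
    return sum(e >= u for e in estimates) / n


rng = random.Random(0)  # fixed seed so the sketch is reproducible
theta = 1.0
draws = [rng.gauss(theta, 1.0) for _ in range(10_000)]  # values of theta*(X) = X
transformed = [math.exp(d) for d in draws]               # values of gamma(theta*(X))

u_grid = [0.0, 0.5, 1.5, 2.0]
a_theta = [empirical_risk(draws, u, theta) for u in u_grid]
a_gamma = [empirical_risk(transformed, math.exp(u), math.exp(theta)) for u in u_grid]
```

Because the transformation preserves order, each event
$\{\theta^* \le u\}$ coincides with $\{\gamma(\theta^*) \le \gamma(u)\}$,
so the two lists agree. A squared-error summary of the same draws would
change under the transformation.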

A theory of point estimation based on this broad formulation
seems appropriate for typical problems of inference occurring in
empirical research, since various kinds of errors of inference and
their probabilities admit simple direct interpretations, whereas
other formulations introduce specifications akin to costs of
various errors which seem somewhat hypothetical or arbitrary in
such situations. The present theory also has theoretical and
technical relevance for estimation theories based on more restrictive
formulations, since it includes such theories in a formal
sense which will be elaborated in a following section.

2. Admissible point estimators. An estimator $\theta^*(x)$ of $\theta$
is naturally considered a good one if its error-probabilities are
suitably small, i.e. if (the ordinates of) its risk curves
$a(u, \theta, \theta^*)$, for each $\theta \in \Omega$ and each
$u \ne \theta$, are suitably small. This leads to a natural partial
ordering of estimators, under which some but not all pairs of
estimators can be compared. As a basis for systematic evaluations and
comparisons of estimators we require the following

Definitions: For a given estimation problem, an estimator $\theta^*$
is called at least as good as an estimator $\theta^{**}$ if
$a(u, \theta, \theta^*) \le a(u, \theta, \theta^{**})$ for all
$\theta \in \Omega$ and all $u \ne \theta$. If $\theta^*$ and
$\theta^{**}$ are each at least as good as the other, then
$a(u, \theta, \theta^*) \equiv a(u, \theta, \theta^{**})$, and the
estimators are called equivalent. If neither of $\theta^*$,
$\theta^{**}$ is at least as good as the other, the two estimators are
called not comparable. If $\theta^*$ is at least as good as
$\theta^{**}$ and if
$a(u, \theta, \theta^*) < a(u, \theta, \theta^{**})$ for some
$\theta \in \Omega$ and some $u \ne \theta$, $\theta^*$ is called
better than $\theta^{**}$. An estimator $\theta^*$ is called
admissible if no other estimator is better than $\theta^*$.
The class of admissible estimators is called the admissible class.
A class of estimators is called complete if, for each estimator
outside the class, there is a better one in the class. The minimal
(smallest) complete class, if one exists, coincides with the
admissible class. A class of estimators is called essentially
complete if, for each estimator not in the class, there is one at
least as good in the class. A minimal essentially complete class,
if one exists, is a subclass of the admissible class.
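The partial ordering in these definitions can be explored numerically
by comparing two risk curves over a finite grid of $\theta$ and $u$
values. The sketch below is only suggestive, since a grid cannot
establish an inequality for all $\theta$ and $u$. It assumes estimators
that are normally distributed about $\theta$ plus a fixed shift; the
helper names `normal_risk` and `compare` are hypothetical.

```python
import math


def phi(v: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def normal_risk(shift: float, sd: float):
    """Risk curve a(u, theta) of an estimator distributed
    N(theta + shift, sd^2); an assumed family used for illustration."""
    def a(u: float, theta: float) -> float:
        z = (u - theta - shift) / sd
        return phi(z) if u < theta else 1.0 - phi(z)
    return a


def compare(a1, a2, thetas, us, tol: float = 1e-12) -> str:
    """Classify estimator 1 relative to estimator 2 on the grid."""
    first_le = first_ge = True
    for th in thetas:
        for u in us:
            if u == th:
                continue  # risk curves are defined only for u != theta
            d = a1(u, th) - a2(u, th)
            if d > tol:
                first_le = False   # estimator 1 has a larger error probability here
            if d < -tol:
                first_ge = False   # estimator 1 has a smaller error probability here
    if first_le and first_ge:
        return "equivalent"
    if first_le:
        return "better"
    if first_ge:
        return "worse"
    return "not comparable"


thetas = [-1.0, 0.0, 1.0]
us = [i / 4 for i in range(-20, 21)]
mean_of_two = normal_risk(0.0, math.sqrt(0.5))  # mean of two N(theta, 1) observations
single_obs = normal_risk(0.0, 1.0)              # one N(theta, 1) observation
shifted = normal_risk(0.3, 1.0)                 # one observation plus 0.3
```

On this grid `compare(mean_of_two, single_obs, thetas, us)` returns
`"better"`, while `compare(single_obs, shifted, thetas, us)` returns
`"not comparable"`: the shift lowers every under-estimation
probability and raises every over-estimation probability.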

The above definition of admissibility was included in a list
of criteria for point estimators by Savage [4] (pp. 224-225), but it
has not previously been used systematically.

The criterion of closeness of estimators introduced by Pitman [5]
also deals with the full c.d.f.'s of estimators, in the form of
the joint distribution of each pair of estimators being compared;
however this criterion does not give a partial ordering of estimators,
and does not lend itself to our present purposes.

For the probabilities of under-estimation and over-estimation,
we define also

$$a(\theta-, \theta, \theta^*) = \mathrm{Prob}\{\theta^*(X) < \theta \mid \theta\} = \lim_{\epsilon \to 0,\ \epsilon > 0} a(\theta - \epsilon, \theta, \theta^*),$$

$$a(\theta+, \theta, \theta^*) = \mathrm{Prob}\{\theta^*(X) > \theta \mid \theta\} = \lim_{\epsilon \to 0,\ \epsilon > 0} a(\theta + \epsilon, \theta, \theta^*).$$

For formal convenience, we also define $a(\theta, \theta, \theta^*) = 0$.
When reference to a given estimator $\theta^*$ is understood, we may
write simply $a(u, \theta)$, $a(\theta-, \theta)$, or
$a(\theta+, \theta)$. The functions $a(\theta-, \theta)$ and
$a(\theta+, \theta)$ of $\theta$ play a useful technical role, and
will be called respectively the lower and upper location functions of
$\theta^*$.
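For a discrete estimator the two location functions need not sum to
one, because the estimator can equal $\theta$ exactly with positive
probability. The sketch below evaluates both location functions and
the remaining mass under an assumed binomial model with
$\theta^*(x) = x/n$, chosen only for illustration.

```python
from math import comb


def location_functions(theta: float, n: int):
    """Return (a(theta-, theta), a(theta+, theta), Prob{theta* = theta})
    for the assumed estimator theta*(x) = x / n, X ~ Binomial(n, theta)."""
    lower = upper = exact = 0.0
    for k in range(n + 1):
        p = comb(n, k) * theta**k * (1.0 - theta) ** (n - k)
        estimate = k / n
        if estimate < theta:
            lower += p      # under-estimation
        elif estimate > theta:
            upper += p      # over-estimation
        else:
            exact += p      # estimate exactly correct
    return lower, upper, exact
```

With `n = 4` and `theta = 0.5` this gives lower and upper location
functions of 0.3125 each and a probability of 0.375 that the estimate
is exactly correct.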

In many problems, estimators for which
$\mathrm{Prob}\{\theta^*(X) = \theta \mid \theta\} > 0$ for some
$\theta$ are found not useful. The remaining estimators have
continuous c.d.f.'s, and have
$a(\theta-, \theta) \equiv 1 - a(\theta+, \theta)$. No two such
estimators, having different location functions, can be comparable;
for $a(\theta-, \theta, \theta^*) < a(\theta-, \theta, \theta^{**})$
is equivalent to
$a(\theta+, \theta, \theta^*) > a(\theta+, \theta, \theta^{**})$;
this shows that neither estimator is at least as good as the other.

The broad and "weak" definition of admissibility adopted here
leads to very large admissible classes in typical problems. However
it does not seem unreasonable to conceive of the problem of point
estimation as one in which the investigator chooses an estimator on
the basis of consideration of the risk curves of all estimators in
some essentially complete class. In principle this consideration
should be complete, but of course the practical counterpart of this
can be at most a more or less extensive familiarity with an
essentially complete class, developed by study of the risk curves of a
variety of specific estimators, possibly strengthened by some
general theoretical considerations (including envelope risk curves,
discussed below), and perhaps also by reference to one or several loss
functions and criteria of optimality which may seem more or less
appropriate in specific applications. Such an approach is not so
difficult to carry out as might be anticipated, as will be
illustrated. Of course difficulties of computation or complexity may
sometimes dictate that an inadmissible estimator must be adopted;
even in such cases, the most general basis on which any particular
estimator might be justified as not too inefficient is evidently
the comparison of its risk curves with those of other estimators,
especially admissible ones.

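Example. Let $X$ be normally distributed with unknown mean $\theta$
and variance 1, with
$\Omega = \{\theta \mid -\infty < \theta < \infty\}$. Consider, when
$\theta = 1$, the risk curves of the classical estimator
$\hat{\theta}(x) = x$, and of the estimators $\theta^*(x) = x + 1$ and
$\theta^{**}(x) \equiv +\infty$. We have

$$a(u, 1, \hat{\theta}) = \begin{cases}
\Phi(u - 1) & \text{for } u < 1, \\
1 - \Phi(u - 1) & \text{for } u > 1,
\end{cases}$$

where

$$\Phi(v) = (2\pi)^{-1/2} \int_{-\infty}^{v} e^{-t^2/2}\, dt,$$

and, since $\theta^*(X) \le u$ exactly when $X \le u - 1$,

$$a(u, 1, \theta^*) = \begin{cases}
\Phi(u - 2) & \text{for } u < 1, \\
1 - \Phi(u - 2) & \text{for } u > 1,
\end{cases}
\qquad
a(u, 1, \theta^{**}) = \begin{cases}
0 & \text{for } u < 1, \\
1 & \text{for } u > 1.
\end{cases}$$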

Our wishful goal in choosing an estimator would be to minimize
simultaneously all ordinates of such curves, for all $\theta$ and all
$u \ne \theta$, since each ordinate is the probability of an error. Of
course this goal cannot be realized in non-trivial problems. The
estimator $\theta^*$ is superior to $\hat{\theta}$ with respect to all
errors of under-estimation, but worse with respect to over-estimation.
From this standpoint neither can be called better than the other; they
are not comparable. The apparently trivial estimator $\theta^{**}$
(but no "smaller" one) is perfect in avoiding errors of
under-estimation, but is as bad as possible with respect to
over-estimation.

It will be seen below that each of these estimators is not
only admissible but that each has, among all estimators with the
same location functions, uniformly smallest risk curves.
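The three risk curves of the example at $\theta = 1$ can be tabulated
with a few lines of code. This is a sketch under the example's own
assumptions ($X$ normal with unit variance); the function names are
chosen here for illustration, and each function assumes $u \ne 1$.

```python
import math

THETA = 1.0  # true parameter value used in the example


def Phi(v: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def a_classical(u: float) -> float:
    """Risk curve at theta = 1 of the classical estimator x."""
    return Phi(u - THETA) if u < THETA else 1.0 - Phi(u - THETA)


def a_shifted(u: float) -> float:
    """Risk curve at theta = 1 of the estimator x + 1."""
    return Phi(u - 1.0 - THETA) if u < THETA else 1.0 - Phi(u - 1.0 - THETA)


def a_infinite(u: float) -> float:
    """Risk curve at theta = 1 of the estimator identically +infinity:
    it never under-estimates and always over-estimates."""
    return 0.0 if u < THETA else 1.0


# Rows of (u, classical, shifted, infinite) for a few error magnitudes.
table = [(u, a_classical(u), a_shifted(u), a_infinite(u))
         for u in (-1.0, 0.0, 0.5, 1.5, 2.0, 3.0)]
```

At `u = 0` the classical estimator errs with probability about 0.159
and the shifted one with probability about 0.023; at `u = 2` the
figures are about 0.159 and 0.5, which is the trade-off between
under-estimation and over-estimation described above.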

In most decision-theoretic formulations of statistical problems
a real-valued risk function $r(\theta, \theta^*)$ is defined for each
parameter point and each decision function. In the present
formulation, we associate with each pair $\theta$, $\theta^*$ a set of
error-probabilities $a(u, \theta, \theta^*)$, $u \ne \theta$. These
respective error-probabilities, for each fixed $\theta$ and
$\theta^*$, may be regarded as components of a vector denoted by
$r(\theta, \theta^*) = \{a(u, \theta, \theta^*)\}$, the components
$a(u, \theta, \theta^*)$ having index $u$. Then $r(\theta, \theta^*)$
is an example of a vector-valued risk function.

Knowledge of the admissible class or of an essentially complete
class of estimators in the present broad sense can be useful in
applying other formulations of the estimation problem. For example,
every estimator which is admissible with respect to a squared error
loss function must clearly be admissible in the present sense; hence
the search for estimators good in the former sense can be restricted
without loss to any class known to be essentially complete in the
broader sense. In this way, a hierarchy of definitions of
admissibility leads to a corresponding nested hierarchy of admissible
or essentially complete classes of estimators. (The latter concepts,
and that of vector-valued risk functions, were introduced in other
contexts by L. Weiss [6].)

Figure 1

3. Admissible confidence limits. If $\theta^* = \theta^*(x)$ is a
point estimator of $\theta$ in a specified problem, with the property
that
$\mathrm{Prob}\{\theta^*(X) < \theta \mid \theta\} = a(\theta-, \theta, \theta^*)$
is relatively small for all $\theta$, then $\theta^*$ is an upper
estimator of $\theta$. In particular, if
$a(\theta-, \theta, \theta^*) \equiv \alpha$ for all $\theta$, then
$\theta^*$ is an upper confidence limit with confidence coefficient
$1 - \alpha$, or an upper $(1 - \alpha)$ confidence limit. Typically
a value $(1 - \alpha) \gg .5$ is chosen.
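As a concrete instance, consider an assumed normal model with known
standard deviation, in which $\theta^*(x) = x + \sigma z_{1-\alpha}$
is an upper $(1 - \alpha)$ confidence limit. The sketch below checks
that its lower location function is constant and equal to $\alpha$.
The model and the function names are assumptions of the illustration;
`statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def upper_confidence_limit(x: float, alpha: float, sigma: float = 1.0) -> float:
    """Upper (1 - alpha) confidence limit for the mean of N(theta, sigma^2)."""
    return x + sigma * Z.inv_cdf(1.0 - alpha)


def lower_location_function(theta: float, alpha: float, sigma: float = 1.0) -> float:
    """a(theta-, theta, theta*) = Prob{theta*(X) < theta | theta}."""
    margin = sigma * Z.inv_cdf(1.0 - alpha)
    # theta*(X) < theta happens exactly when X < theta - margin.
    return NormalDist(theta, sigma).cdf(theta - margin)
```

Evaluating `lower_location_function(theta, 0.05)` at several values of
`theta` returns 0.05 each time, up to rounding, so the limit is valid
in the sense used in this section.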

The typical use and interpretation of an upper estimate is
the following: When a given numerical value (observed value) is
obtained by use of an upper estimator, this is taken as evidence
supporting the conclusion or decision that the true unknown value
is at least as small as the estimated value. Hence the merits of
any upper estimator depend upon the following considerations, in
suitable combination:

(a) The probability should be suitably high that the indicated
conclusions, of the form: "$\theta$ is not greater than
$\theta^*(x)$," are correct for each possible true value of $\theta$.
That is, the confidence coefficient should have a suitably large
value; or, more generally, the lower location function
$a(\theta-, \theta, \theta^*)$ should have suitably low values
for all $\theta$. Such properties are sometimes referred to by the
term validity, particularly in the case of confidence limit
estimators; a valid $(1 - \alpha)$ upper confidence limit estimator is
one which does in fact have the property that
$\mathrm{Prob}\{\theta^* < \theta \mid \theta\} \equiv \alpha$ for all
$\theta \in \Omega$.

(b) Given that one of the indicated conclusions
("$\theta \le \theta^*(x)$") is correct, it should be as strong and
informative a conclusion as possible; hence for each possible true
value of $\theta$, the conditional distribution of $\theta^*(X)$,
given that $\theta \le \theta^*(X)$, should be concentrated as close
to $\theta$ as possible. That is, given the location function
$a(\theta-, \theta, \theta^*)$ of any upper estimator $\theta^*$, for
each $\theta$ and each $u > \theta$ the values
$a(u, \theta, \theta^*) = \mathrm{Prob}\{\theta^*(X) \ge u \mid \theta\}$
should be suitably small. Such properties of confidence limits have
been termed accuracy properties by Lehmann [7], p. 78. More generally,
in the theory of confidence region estimation, such properties have
been termed shortness properties by Neyman [8].

(c) Given that one of the indicated conclusions
("$\theta \le \theta^*(x)$") is incorrect (i.e. that in fact
$\theta > \theta^*(x)$), the indicated conclusion should be misleading
in the smallest possible degree. For example, in any given problem,
under any given true value of $\theta$, when an upper estimator takes
a value two units below the true value, the indicated conclusions (or
inferences or actions or decisions) are at least as erroneous (or
inappropriate), and in general more so, than when an upper estimator
(with the same confidence coefficient or location function) takes a
value which is only one unit below the true value. That is, given the
location function $a(\theta-, \theta, \theta^*)$, for each $\theta$
and each $u < \theta$ the values $a(u, \theta, \theta^*)$ should be
suitably small. This property has evidently not previously been
discussed along with those of validity and shortness, but it seems
necessary to include it for a complete specification of the practical
purposes and intuitive goals of confidence limit estimation. All three
properties are given some weight in a specific loss function adopted
in the decision-theoretic treatment of Wolfowitz [9].

These considerations lead in the usual way to definitions of
admissibility and of complete classes of upper and lower estimators.
Properties (b) and (c) together are formally identical with the
closeness properties considered in the preceding section for point
estimators, while property (a) by itself is merely descriptive of
the location function of a point estimator. Thus every admissible
confidence limit estimator is, formally, an admissible point
estimator as defined above, and is contained in every complete
class of point estimators.

Hence there is no necessary formal distinction between the
formulations, theories, and practical techniques of point estimation
on the one hand and of confidence limit estimation on the other:
the distinctions required here are only those of qualitative
emphasis and quantitative degree which reflect the variety of possible
purposes for which a point or confidence limit estimator may
be chosen from, say, an essentially complete class. For example,
in choosing an upper estimator for a given application, it may be
judged that property (c) above should be given no weight as compared
with properties (a) and (b) because "a miss is as good as
a mile" in the given context of application; in other contexts,
including probably most cases of estimation for informative
inference, some weight may be given to each property.

4. Admissible interval estimators. If
$J = J(x) = (\theta', \theta'') = (\theta'(x), \theta''(x))$ is a pair
of point estimators such that $\theta'(x) \le \theta''(x)$ for each
$x$ in $S$, then $J$ is an interval estimator of $\theta$. In
particular, if
$\mathrm{Prob}\{\theta'(X) \le \theta \le \theta''(X) \mid \theta\} \equiv 1 - \alpha$
for each $\theta$, then $J$ is a confidence interval with confidence
coefficient $1 - \alpha$, or a $(1 - \alpha)$ confidence interval.
(Typically a value $(1 - \alpha) \gg .5$ is chosen.) The typical use
and interpretation of an interval estimate is the following: When
given numerical values $\theta'$ and $\theta''$ are obtained by use of
an interval estimator, this is taken as evidence for the conclusion
that the true unknown value of the parameter $\theta$ lies in the
closed interval $[\theta', \theta'']$.

The probability properties of any interval estimator $J$ may be
described in the following terms: It is natural to call
$a(\theta-, \theta, \theta'')$ the lower location function of $J$ (as
well as of $\theta''$), and to denote it when convenient by
$a(\theta-, \theta, J)$; similarly
$a(\theta+, \theta, J) \equiv a(\theta+, \theta, \theta')$ is the
upper location function of $J$. As with point estimators, these
functions give respectively the probabilities of under-estimation and
of over-estimation when a given interval estimator $J$ is used. For
example, it is natural to call $J$ a median-unbiased interval
estimator if for each $\theta$ we have equal probabilities of
over-estimation and under-estimation:
$a(\theta-, \theta, J) = a(\theta+, \theta, J)$. This usage is
compatible with the definition of a median-unbiased point estimator.

A quantity of primary interest is the probability that the
conclusion indicated by any interval estimator $J$ ("$\theta$ lies in
$[\theta', \theta'']$") will be incorrect, for each possible true
value $\theta$. This probability is just the sum of the location
functions of $J$:

$$\mathrm{Prob}\{\theta \text{ not covered by } J(X) \mid \theta\}
= \mathrm{Prob}\{\theta''(X) < \theta \mid \theta\}
+ \mathrm{Prob}\{\theta'(X) > \theta \mid \theta\}
= a(\theta-, \theta, J) + a(\theta+, \theta, J).$$

If this probability equals $\alpha$ for each $\theta$, then $J$ is a
$(1 - \alpha)$ confidence interval; if in addition $J$ is
median-unbiased, then $\theta'$ and $\theta''$ are
$(1 - \tfrac{1}{2}\alpha)$ confidence limits. As with point and
confidence limit estimators, it is of interest in general to consider
the probabilities of errors of under-estimation and of over-estimation
of various magnitudes in interval estimation; we denote these
probabilities by

$$a(u, \theta, J) = \begin{cases}
a(u, \theta, \theta') & \text{for each } u > \theta, \\
a(u, \theta, \theta'') & \text{for each } u < \theta.
\end{cases}$$
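These quantities are easy to evaluate for the familiar symmetric
interval about a normal observation. The sketch below assumes
$X \sim N(\theta, \sigma^2)$ and $J(x) = (x - h, x + h)$ for a fixed
half-width $h$; the model, the symbol $h$, and the function names are
choices made for this illustration only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def interval_risk(u: float, theta: float, half_width: float, sigma: float = 1.0) -> float:
    """a(u, theta, J) for J(x) = (x - half_width, x + half_width)."""
    if u == theta:
        raise ValueError("defined only for u != theta")
    if u > theta:
        # Over-estimation by the lower endpoint: Prob{X - h >= u | theta}
        return 1.0 - Z.cdf((u + half_width - theta) / sigma)
    # Under-estimation by the upper endpoint: Prob{X + h <= u | theta}
    return Z.cdf((u - half_width - theta) / sigma)


def interval_location_functions(half_width: float, sigma: float = 1.0):
    """Return (a(theta-, theta, J), a(theta+, theta, J)); for this
    interval neither depends on theta."""
    lower = Z.cdf(-half_width / sigma)        # Prob{X + h < theta | theta}
    upper = 1.0 - Z.cdf(half_width / sigma)   # Prob{X - h > theta | theta}
    return lower, upper


h = Z.inv_cdf(0.975)  # half-width giving total error probability .05
lower, upper = interval_location_functions(h)
```

Here `lower` and `upper` are each about .025, so the interval is
median-unbiased and its endpoints are .975 confidence limits, in line
with the statement above. Values of `interval_risk` for `u` further
from `theta` are smaller than these limits.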

In a formal sense, a point estimator may be regarded as an
interval estimator $J = (\theta', \theta'')$ having the special form:
$\theta'(x) \equiv \theta''(x)$ for all $x$. The full specification of
what is meant by a good point estimator $\theta^*$, by use of the risk
curves $a(u, \theta, \theta^*)$, corresponds to the use of the
functions $a(u, \theta, J)$ to specify at least part of what is meant
by a good interval estimator $J$.

Again, in a formal sense an upper estimator $\theta''(x)$ may be
regarded as an interval estimator $J = (\theta', \theta'')$ having the
special form: $\theta'(x) \equiv$ the greatest lower bound of
$\Omega$, for all $x$. The full specification of what is meant by a
good upper estimator $\theta''$, by use of the risk curves
$a(u, \theta, \theta'')$, corresponds to part of what is meant by a
good interval estimator; in particular, small values of
$a(u, \theta, \theta'')$ for $u > \theta$, which indicate desirable
properties of accuracy or shortness for an upper estimator
$\theta''$, indicate corresponding shortness properties for an
interval estimator $J = (\theta', \theta'')$.

The merits of any interval estimator $J$ depend upon the following
considerations in suitable combination.

(a) The probability should be suitably high that the indicated
conclusions ("$\theta$ lies in $[\theta', \theta'']$") are correct,
for each possible true value of $\theta$. That is, the confidence
coefficient should have a suitably high value; or, more generally, for
each $\theta$, the sum of the location functions
$a(\theta-, \theta, J)$ and $a(\theta+, \theta, J)$ should be suitably
low. As with point estimators, it seems desirable to avoid, as far
as possible and convenient in the development of a general theory,
any step which corresponds to a tacit judgment that errors of
over-estimation and under-estimation are necessarily comparable either
qualitatively or quantitatively. Hence the present specification
will be given the form: Each of the location functions
$a(\theta-, \theta, J)$, $a(\theta+, \theta, J)$ should have suitably
small values, for each $\theta$.

(b) Given the location functions of an interval estimator (and,
hence, given the probability
$1 - a(\theta-, \theta, J) - a(\theta+, \theta, J)$ of correct
conclusions, for each $\theta$), the indicated conclusions should when
correct be as strong and informative as possible. That is, for
each $\theta$, the conditional distributions of $\theta'(X)$ and
$\theta''(X)$, given that $\theta'(X) \le \theta \le \theta''(X)$,
should be concentrated as close to $\theta$ as possible. (In terms of
the conditional bivariate distribution of
$(\theta'(X), \theta''(X))$, this means concentration close to the
point $(\theta, \theta)$.) These desirable shortness properties of $J$
correspond to suitably small values, for each $\theta$, of
$a(u, \theta, \theta'')$ for each $u > \theta$ and of
$a(u, \theta, \theta')$ for each $u < \theta$.

(c) Given that one of the conclusions indicated by $J$ is incorrect,
it should be misleading in the smallest possible degree. (The
remarks on property (c) of the preceding section are also applicable
here.) These desirable closeness properties of $J$ correspond to
suitably small values of $a(u, \theta, J)$ for each $\theta$ and each
$u \ne \theta$; that is, suitably small values of
$a(u, \theta, \theta')$ for $u > \theta$ and of
$a(u, \theta, \theta'')$ for $u < \theta$.

To represent all of the properties considered for interval
estimators, we define the risk curves of each interval estimator
$J = (\theta', \theta'')$, at each $\theta$, as the pair of functions
$[a(u, \theta, \theta'), a(u, \theta, \theta'')]$ of $u$
($u \ne \theta$), i.e. the risk curves of $\theta'$ and of
$\theta''$. Thus the risk curves of $J$ at $\theta$ are a
representation of the bivariate cumulative distribution function of
$\theta'(X)$ and $\theta''(X)$ when $\theta$ is true.



These considerations lead us to formulate the following basic
definitions: An interval estimator $J = (\theta', \theta'')$ will be
called at least as good as another
$J^* = (\theta^{*\prime}, \theta^{*\prime\prime})$ if $\theta'$ is at
least as good as $\theta^{*\prime}$ and $\theta''$ is at least as good
as $\theta^{*\prime\prime}$ in the sense defined for point estimators
in Section 2 above. Similarly, $J$ will be called better than $J^*$ if
it is at least as good as $J^*$ and also $\theta'$ is better than
$\theta^{*\prime}$ and/or $\theta''$ is better than
$\theta^{*\prime\prime}$. $J$ will be called admissible if no other
interval estimator is better. Complete classes are defined in the
usual way.

If two interval estimators have different location functions,
they are not comparable (neither is at least as good as the other);
this follows immediately from the corresponding property for point
estimators. A simple sufficient condition for admissibility of
$J = (\theta', \theta'')$ is that $\theta'$ and $\theta''$ be
admissible point estimators.

5. Confidence curve estimators. The selection of an estimator of
one of the above kinds for purposes of informative inference,
including typical applications in scientific research, is generally
admitted to involve elements of choice which are in some degree
arbitrary. Such elements include the choice of a particular
confidence level for an interval estimator, and the choice of
location functions for an interval estimator with given confidence
coefficient. In addition, a point estimate is sometimes desired
along with an interval. Such considerations and related ones have
led to proposals for use simultaneously of a point estimator and a
set of confidence limit or interval estimators having various
confidence coefficients. Such estimators may be regarded as a
modern formulation of a long-standing practice of reporting
estimates in the form $\theta^* \pm k\sigma_{\theta^*}$, where $k$ is
some constant and
$\sigma^2_{\theta^*} = \mathrm{Var}(\theta^*(X))$. The latter form may
be interpreted as an ordered set of three point estimators. For
example, if $\theta^*(X)$ has a normal distribution with a known
constant variance, and $k = 1$, then the "estimator"
$\theta^*(x) \pm k\sigma_{\theta^*}$ may be written as the ordered
set of estimators

$$[\theta^*(x) - \sigma_{\theta^*},\ \theta^*(x),\ \theta^*(x) + \sigma_{\theta^*}] = [\theta(x, .84),\ \theta(x, .5),\ \theta(x, .16)].$$


Estimates of this "omnibus" kind can be interpreted flexibly but
validly, in any context of application for informative inferences,
in the ways customary for (a) point estimates such as $\theta(x,.5)$,
(b) confidence limits such as $\theta(x,.84)$ and $\theta(x,.16)$, and (c)
confidence intervals such as $[\theta(x,.84),\ \theta(x,.16)]$.
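
The omnibus estimate of the normal example can be computed directly. The following is a minimal sketch, assuming that $\theta^*(X)$ is normal with known standard deviation $\sigma_{\theta^*}$, so that the confidence limit at level $\alpha$ is $\theta(x,\alpha) = \theta^*(x) - \sigma_{\theta^*}\Phi^{-1}(\alpha)$. The function name `confidence_limit` and the numerical values are illustrative assumptions, not part of the text.

```python
from statistics import NormalDist

def confidence_limit(theta_star, sigma, alpha):
    """Level-alpha confidence limit theta(x, alpha) for a normal estimate.

    Assumes theta*(X) ~ N(theta, sigma^2) with sigma known, so that
    Prob[theta(X, alpha) <= theta | theta] = alpha.
    """
    return theta_star - sigma * NormalDist().inv_cdf(alpha)

# Hypothetical observed estimate and known standard deviation.
theta_star, sigma = 10.0, 2.0

# Ordered set: lower limit, point estimate, upper limit.
omnibus = [confidence_limit(theta_star, sigma, a) for a in (0.84, 0.50, 0.16)]
print(omnibus)
```

The levels .84 and .16 reproduce $\theta^* \mp \sigma_{\theta^*}$ only approximately, since $\Phi(1)$ is about .8413 rather than exactly .84.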

Tukey [10] proposed that for typical general purposes it would
be advantageous to use a set of five point estimators at standard
levels: $\theta(x,\alpha)$, with $\alpha = 2\tfrac{1}{2}\%$, $16\tfrac{2}{3}\%$, $50\%$, $83\tfrac{1}{3}\%$, and $97\tfrac{1}{2}\%$.
Cox [11] proposed use of the full continuous family of confidence
limits $\theta(x,\alpha)$, $0 \le \alpha \le 1$. Such an omnibus estimator includes
formally, as elements, not only confidence limits at all levels
and a median-unbiased point estimator, but also median-unbiased
confidence intervals at all levels. Whether such estimators should
be used in practice, rather than more standard methods, is a matter
of judgment and taste which can perhaps be decided best in specific
contexts of application. It is often convenient, as will be
illustrated below, to discuss estimation theory and techniques for
estimators of this omnibus form, since such discussion includes
conveniently and compactly a treatment of estimators of the various
kinds mentioned.

Any such estimator, consisting of a specified set of confidence
limit estimators $\theta(x,\alpha)$, $\alpha$ in some specified subset of the closed
unit interval (possibly the whole interval), ordered in the sense
that $\alpha < \alpha'$ implies $\theta(x,\alpha) > \theta(x,\alpha')$ for each $x$ in $S$, will be
called a confidence curve estimator. We shall usually consider
the inclusive case, $0 \le \alpha \le 1$, so as to include formally all other
cases. In many problems it is convenient to give such estimators
a form which can be reported graphically: if for each $x \in S$, $\theta(x,\alpha)$
increases continuously from $-\infty$ to $\infty$ as $\alpha$ decreases from 1 to 0, then
we define the confidence curve estimator $c(\theta,x)$, for each $x \in S$,
as the continuous curve (function of $\theta \in \Omega$)

$$c(\theta,x) = \min\,[\alpha,\ 1-\alpha \mid \theta(x,\alpha) = \theta].$$

For example, if $X$ is normally distributed with unit variance and
mean $\theta$, then the confidence curve estimator of $\theta$ is

$$c(\theta,x) = \begin{cases} \Phi(\theta - x), & -\infty \le \theta \le x, \\ 1 - \Phi(\theta - x), & x \le \theta \le \infty; \end{cases}$$

for any observed value $x$, the estimate $c(\theta,x)$ can be described by
a more or less complete sketch of its graph when convenient. Such
estimates are illustrated in a number of examples in Section 9
below.
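
The normal example can be tabulated numerically. The following is a minimal sketch, assuming $X \sim N(\theta, 1)$, for which $\theta(x,\alpha) = x - \Phi^{-1}(\alpha)$ and hence the level attached to a given $\theta$ is $\alpha = \Phi(x - \theta)$. The name `confidence_curve` and the observed value are illustrative assumptions.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.

def confidence_curve(theta, x):
    """c(theta, x) = min(alpha, 1 - alpha), where theta(x, alpha) = theta.

    For X ~ N(theta, 1), theta(x, alpha) = x - PHI^{-1}(alpha), so the
    level attached to theta is alpha = PHI(x - theta).
    """
    alpha = PHI(x - theta)
    return min(alpha, 1.0 - alpha)

x_obs = 1.3  # hypothetical observation
for theta in (x_obs - 2, x_obs - 1, x_obs, x_obs + 1, x_obs + 2):
    print(f"theta={theta:5.2f}  c={confidence_curve(theta, x_obs):.4f}")
```

The curve peaks at height 1/2 at $\theta = x$ (the median-unbiased point estimate) and falls off symmetrically; the two points at which it has height $\alpha < 1/2$ are the endpoints of the interval $[\theta(x,1-\alpha),\ \theta(x,\alpha)]$.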

The definitions of admissibility and of complete classes for
confidence curve estimators parallel those above for confidence
interval estimators. A simple sufficient (but not, in general,
necessary) condition that a confidence curve estimator be
admissible is that for each $\alpha$, its element $\theta(x,\alpha)$ be an admissible
point estimator. In problems for which there exists a uniformly
best confidence limit estimator for each confidence coefficient,
this condition is necessary as well as sufficient, and there is a
unique (a.e.) admissible confidence curve estimator which consists
simply of the family of these best confidence limit estimators.

6. Elementary theory of admissible point estimators. An important
part of the general theory of admissible point estimators, and of
corresponding practical techniques of estimation, can be developed
conveniently by an essentially elementary use of the theory of
tests of one-sided hypotheses as originated by Neyman and Pearson
and as extended (by simple use of their Fundamental Lemma) to
generate a variety of admissible tests of such hypotheses. In
problems for which uniformly best one-sided tests exist, the
complete theory of admissible estimators is obtained in this way; for
other problems, the development of the remaining parts of the
theory requires more general methods introduced in Section 10 below.

For each $\theta_0$ in $\Omega$, we consider two one-sided testing problems:
(a) the problem of testing the hypothesis $H(\theta_0)$: $\theta \le \theta_0$ (against
the general alternative $H'(\theta_0)$: $\theta > \theta_0$), and (b) the problem of
testing $H(\theta_0-)$: $\theta < \theta_0$ (against the general alternative $H'(\theta_0-)$:
$\theta \ge \theta_0$). In case $\theta_0$ is a minimum value in $\Omega$, consideration of
$H(\theta_0-)$ is to be omitted; if $\theta_0$ is a maximum in $\Omega$, $H(\theta_0)$ is omitted.

Any given point estimator $\theta^* = \theta^*(x)$ of $\theta$ can be used in the
following way to define a test of each of the hypotheses mentioned:
Accept the hypothesis if and only if the observed value $\theta^*(x)$ is
consistent with the hypothesis. Such a test of the hypothesis
$H(\theta_0)$ has the acceptance region $A(\theta_0) = \{x \mid \theta^*(x) \le \theta_0\}$; such a
test of $H(\theta_0-)$ has acceptance region $A(\theta_0-) = \{x \mid \theta^*(x) < \theta_0\}$.
If $\theta_1 < \theta_2$, then $A(\theta_1-) \subset A(\theta_1) \subset A(\theta_2-) \subset A(\theta_2)$; for brevity, we
shall say that such a sequence of sets $A(\theta)$ is nondecreasing in $\theta$,
with the understanding that the argument $\theta$ may take a value $(\theta-)$ which
is considered smaller than $\theta$ and larger than $\theta - \epsilon$ for each positive $\epsilon$.


Such a test of $H(\theta_0-)$ has probabilities of errors of Type I
given by

$$1 - \mathrm{Prob}(A(\theta_0-) \mid \theta) = a(\theta_0,\theta,\theta^*) \quad \text{for each } \theta < \theta_0,$$

and of Type II given by

$$\mathrm{Prob}(A(\theta_0-) \mid \theta) = a(\theta_0-,\theta,\theta^*) \quad \text{for each } \theta \ge \theta_0.$$

Such a test of $H(\theta_0)$ has probabilities of errors of Type I given
by

$$1 - \mathrm{Prob}(A(\theta_0) \mid \theta) = a(\theta_0+,\theta,\theta^*) \quad \text{for each } \theta \le \theta_0,$$

and of Type II given by

$$\mathrm{Prob}(A(\theta_0) \mid \theta) = a(\theta_0,\theta,\theta^*) \quad \text{for each } \theta > \theta_0.$$

Thus each of the error-probabilities $a(u,\theta,\theta^*)$, upon which depends
the admissibility of any given point estimator $\theta^*$, appears as an
error-probability of a test of a one-sided hypothesis based upon
use of $\theta^*$. These relationships provide the following simple
sufficient condition for admissibility of a point estimator.
Lemma 1. For any specified family of probability density functions
$f(x,\theta)$ (with respect to an underlying $\sigma$-finite measure $\mu(x)$ defined
on the sample space $S = \{x\}$), $\theta \in \Omega$ (a subset of the real line),
a given estimator $\theta^* = \theta^*(x)$ (any measurable function taking
values in the closure $\bar\Omega$ of $\Omega$) is admissible if each of the
acceptance regions $A(\theta_0)$, $A(\theta_0-)$, based on $\theta^*$ as defined above, gives an
admissible test of the corresponding one-sided hypotheses
$H(\theta_0)$, $H(\theta_0-)$ defined above.

Proof: (A test is called admissible if no other test has all
error-probabilities at least as small, with at least one strictly smaller.)
If $\theta^*$ satisfies the assumptions of the Lemma but is inadmissible,
let $\theta^{**}$ be an estimator better than $\theta^*$. Then
$a(\theta_0,\theta,\theta^{**}) \le a(\theta_0,\theta,\theta^*)$ for each $\theta \in \Omega$ and each $\theta_0 \ne \theta$, and the
inequality is strict for some $\theta = \theta' \in \Omega$ and some
$\theta_0 = \theta_0' \in \bar\Omega$, $\theta_0' \ne \theta'$. Assume for definiteness that $\theta_0' > \theta'$ (the
other case can be discussed in the same way). Then the acceptance
region $\{x \mid \theta^{**}(x) < \theta_0'\}$ gives a better test of the hypothesis $H(\theta_0'-)$
than does $\{x \mid \theta^*(x) < \theta_0'\}$. This contradicts the assumed
admissibility of the test based on the latter region, completing the proof.
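
The correspondence between risk-curve ordinates and error-probabilities of the tests induced by an estimator can be checked numerically in a simple case. The following is a minimal sketch, assuming $X \sim N(\theta, 1)$ and the estimator $\theta^*(x) = x$; the function names are illustrative assumptions.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.

def risk(u, theta):
    """a(u, theta, theta*) for theta*(X) = X ~ N(theta, 1), with u != theta."""
    if u < theta:
        return PHI(u - theta)        # Prob[theta*(X) <= u | theta]
    return 1.0 - PHI(u - theta)      # Prob[theta*(X) >= u | theta]

def type1_error_H(theta0, theta):
    """Type I error of the test of H(theta0) with A(theta0) = {x <= theta0}.

    Relevant for theta <= theta0: 1 - Prob(A(theta0) | theta).
    """
    return 1.0 - PHI(theta0 - theta)

def type2_error_H(theta0, theta):
    """Type II error of the same test, relevant for theta > theta0."""
    return PHI(theta0 - theta)       # Prob(A(theta0) | theta)

print(type1_error_H(1.0, 0.2), risk(1.0, 0.2))
print(type2_error_H(1.0, 1.8), risk(1.0, 1.8))
```

Because $X$ is continuous here, the distinction between $a(\theta_0+,\theta,\theta^*)$ and $a(\theta_0,\theta,\theta^*)$ disappears; it matters only when $\theta^*(X)$ has atoms.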

Many estimators of interest can be conveniently investigated
theoretically and constructed practically by the device of using as
indicated below a function $v(x,\theta)$, defined for each sample point $x$
and each $\theta \in \bar\Omega$. If, for each fixed $\theta$, $v(x,\theta)$ is a measurable
function of $x$, it is a statistic; and as $\theta$ varies, $v(x,\theta)$ represents
a family of statistics. We term such a function $v$ a quasistatistic.
Corollary 1. A sufficient condition for admissibility of an
estimator $\theta^*(x)$ is that it be defined, for each $x$, as the solution $\theta$ of
the equation $v(x,\theta) = 0$, where $v$ is a quasistatistic such that:

(a) For each $x$ in $S$, $v(x,\theta) = 0$ holds for a unique $\theta$ in $\bar\Omega$.

(b) If $\theta_1 < \theta_2$ and $\theta_1, \theta_2$ are in $\bar\Omega$, then $\{x \mid v(x,\theta_1) \le 0\} \subset \{x \mid v(x,\theta_2) < 0\}$.
(A simple sufficient condition for (b) is that for each $x$, $v(x,\theta)$
be nonincreasing in $\theta$.)

(c) For each $\theta_0$ in $\Omega$, the acceptance regions $\{x \mid v(x,\theta_0) \le 0\}$ and
$\{x \mid v(x,\theta_0) < 0\}$ are admissible respectively for testing the
one-sided hypotheses $H(\theta_0)$ and $H(\theta_0-)$.

Proof: If $v(x,\theta)$ satisfies the stated conditions, the conclusion
follows immediately from Lemma 1 upon observing that

$$\{x \mid v(x,\theta_0) \le 0\} = \{x \mid \theta^*(x) \le \theta_0\} \quad\text{and}\quad \{x \mid v(x,\theta_0) < 0\} = \{x \mid \theta^*(x) < \theta_0\}.$$

When an estimator $\theta^*$ is defined implicitly, by use of a
quasistatistic $v(x,\theta)$, as the solution $\theta$ of the equation $v(x,\theta) = 0$, in
applications it is not necessary to have an explicit formula for
$\theta^*(x)$ since for any observed sample point $x$ it suffices merely to
determine the corresponding root $\theta$ of the defining equation; and in
the cases of many such estimators of practical and theoretical
interest, no explicit formula for $\theta^*(x)$ is available. The
preceding corollary shows that basic qualitative properties of efficiency
can be established for such estimators without use of any explicit
formula for $\theta^*(x)$. Their quantitative properties can also be
determined without such explicit formulas: Since $v(x,u) < 0$ is
equivalent to $\theta^*(x) < u$, and $v(x,u) = 0$ is equivalent to
$\theta^*(x) = u$, we have

$$a(u,\theta,\theta^*) = \begin{cases} \mathrm{Prob}[\theta^*(X) \le u \mid \theta] = \mathrm{Prob}[v(X,u) \le 0 \mid \theta] & \text{for } u < \theta, \\ \mathrm{Prob}[\theta^*(X) \ge u \mid \theta] = \mathrm{Prob}[v(X,u) \ge 0 \mid \theta] & \text{for } u > \theta. \end{cases}$$

Thus all quantitative properties of such estimators $\theta^*$ can be
determined, when convenient, by determining
$\mathrm{Prob}[v(X,u) \le 0 \mid \theta]$ and $\mathrm{Prob}[v(X,u) = 0 \mid \theta]$ for each $u \ne \theta$.
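
The implicit definition of an estimator by a quasistatistic lends itself to a simple numerical routine. The following is a minimal sketch, assuming $v(x,\theta)$ is continuous and decreasing in $\theta$ with a sign change on the search interval; the particular $v$ (a sum of arctangents) and the data are hypothetical, chosen only to illustrate root-finding and the equivalence of $v(x,u) < 0$ with $\theta^*(x) < u$, and no claim of admissibility is made for this $v$.

```python
import math

def solve_root(v, x, lo, hi, tol=1e-10):
    """Bisection for the root theta of v(x, theta) = 0.

    Assumes v(x, .) is continuous and decreasing in theta with
    v(x, lo) > 0 > v(x, hi), so that the root in [lo, hi] is unique.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v(x, mid) > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)

def v(x, theta):
    """Illustrative quasistatistic, decreasing in theta for every sample x."""
    return sum(math.atan(y - theta) for y in x)

sample = [0.2, 1.7, -0.4, 3.1, 0.9]   # hypothetical data
theta_star = solve_root(v, sample, min(sample), max(sample))
print(theta_star, v(sample, theta_star))
```

Once the root is available, the event $\theta^*(x) < u$ can be checked for any $u$ by a single evaluation of $v(x,u)$, without re-solving the equation.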

Some theoretical properties of such estimators are also
conveniently treated in terms of the c.d.f.'s of $v$. For example, if
for each $n = 1, 2, \ldots$, $\theta_n^*$ is an estimator determined by a
quasistatistic $v_n = v_n(x_n,\theta)$, then the condition that the sequence of
estimators $\theta_n^*$ be consistent (that is, that $\lim a(u,\theta,\theta_n^*) = 0$, for
each $\theta \in \Omega$ and each $u \ne \theta$), can be stated, and in many cases
conveniently proved, in the form: $\lim \mathrm{Prob}[v_n(X_n,u) < 0 \mid \theta] = 0$ or $1$,
according as $u < \theta$ or $u > \theta$, for each $\theta \in \Omega$.

For estimation by confidence intervals or confidence curves,
it is sometimes convenient to employ a family of quasistatistics.
Suppose that for each of several values of an index $\alpha$, $v(x,\theta,\alpha)$ is
a quasistatistic which determines as above an estimator $\theta(x,\alpha)$, and
that, for each $x$ in $S$, $\theta(x,\alpha)$ is decreasing in $\alpha$. Then for any
pair of values of $\alpha$, $\alpha' > \alpha''$, the pair of estimators
$[\theta(x,\alpha'),\ \theta(x,\alpha'')] = J(x)$ is an interval estimator of $\theta$, whose
quantitative properties may be investigated in terms of the
distributions of $v(X,u,\alpha)$ as indicated above, and whose admissibility
can in some cases be established by direct application of Corollary 1
to $v(x,\theta,\alpha')$ and $v(x,\theta,\alpha'')$. A case of interest is that in which
$\alpha = \mathrm{Prob}[v(X,\theta,\alpha) \le 0 \mid \theta] = \mathrm{Prob}[v(X,\theta,\alpha) < 0 \mid \theta]$ for each
$\alpha$, $0 \le \alpha \le 1$, and each $\theta \in \Omega$. Then the family of estimators
$\theta(x,\alpha)$ constitutes a confidence curve estimator of $\theta$ (assuming
again that $v(x,\theta,\alpha)$ is decreasing in $\alpha$); this estimator is admissible
if for each $\alpha$ the quasistatistic $v(x,\theta,\alpha)$ satisfies the assumptions
of Corollary 1. Examples of such estimators, and of convenient
techniques for their computation and presentation, are given below.

7. Uniformly best estimators. Let $\theta^*(x)$ be any estimator of
$\theta \in \Omega$. $\theta^*$ will be called a uniformly best estimator of $\theta$ if, among
all estimators with the same location functions $a(\theta-,\theta)$, $a(\theta+,\theta)$, $\theta^*$
has uniformly minimum error-probabilities $a(u,\theta)$. Since the
$a(u,\theta)$'s are error-probabilities of tests of one-sided hypotheses
$H(\theta_0-)$, $H(\theta_0)$, $\theta_0 \in \Omega$, with respective acceptance regions
$A(\theta_0-) = \{x \mid \theta^*(x) < \theta_0\}$, $A(\theta_0) = \{x \mid \theta^*(x) \le \theta_0\}$, a necessary
condition for $\theta^*$ to be a uniformly best estimator is that $f(x,\theta)$
and $\Omega$ admit uniformly best tests of the hypotheses $H(\theta_0-)$, $H(\theta_0)$,
of respective sizes $a(\theta_0-,\theta_0,\theta^*)$, $1 - a(\theta_0+,\theta_0,\theta^*)$, $\theta_0 \in \Omega$.
It is well known [12] that uniformly best one-sided tests of
all sizes exist if and only if there exists a sufficient statistic
$t(x)$ with the monotone likelihood ratio (m.l.r.) property, in which
case each best test may be obtained by use of an acceptance region
of the form

$$A(\theta_0-) = \{(x,y) \mid z(t(x),y,\theta_0) \le a(\theta_0-,\theta_0)\} \quad\text{or}\quad A(\theta_0) = \{(x,y) \mid z(t(x),y,\theta_0) \le 1 - a(\theta_0+,\theta_0)\},$$

where $y$ is the observed value of a uniformly distributed auxiliary
randomization variable $Y$, $0 \le Y < 1$, and $z$ is the continuous
probability integral transform of $t(X)$:

$$z(t(x),y,\theta) = yF(t(x),\theta) + (1-y)F(t(x)-,\theta), \quad\text{where}\quad F(t,\theta) = \mathrm{Prob}\{t(X) \le t \mid \theta\}.$$

If such a sufficient statistic $t(x)$
exists, then a simple sufficient condition for admissibility of an
estimator $\theta^*$ is clearly that $\theta^*$ be a non-decreasing function of
$t(x)$; for then $A(\theta_0-) = \{x \mid \theta^*(t(x)) < \theta_0\}$ and
$A(\theta_0) = \{x \mid \theta^*(t(x)) \le \theta_0\}$ are uniformly best one-sided tests. If
such a statistic $t(x)$ has a discrete distribution on a subset of the
integers, then $t(x) + y$ is another sufficient statistic having
the monotone likelihood ratio property, and having a continuous
c.d.f. under each $\theta$; as above, a simple sufficient condition for
admissibility of an estimator $\theta^*$ is that it be a non-decreasing
function of $t(x) + y$.
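
The effect of the auxiliary randomization can be verified exactly in a discrete case. The following is a minimal sketch, assuming $t(X)$ is binomial with index $n$ and parameter $\theta$ (an illustrative m.l.r. family, not one singled out in the text); it computes $z(t,y,\theta)$ and confirms that $\mathrm{Prob}[z \le c \mid \theta] = c$, so that acceptance regions of the form $\{z \le c\}$ have exactly the prescribed probability. The function names are assumptions.

```python
from math import comb

def binom_pmf(t, n, theta):
    """Prob[t(X) = t | theta] for a binomial statistic."""
    return comb(n, t) * theta**t * (1 - theta)**(n - t)

def binom_cdf(t, n, theta):
    """F(t, theta) = Prob[t(X) <= t | theta]; equals 0 for t < 0."""
    return sum(binom_pmf(k, n, theta) for k in range(0, t + 1))

def z_transform(t, y, n, theta):
    """z(t, y, theta) = y F(t, theta) + (1 - y) F(t-, theta)."""
    return y * binom_cdf(t, n, theta) + (1 - y) * binom_cdf(t - 1, n, theta)

def prob_z_at_most(c, n, theta):
    """Exact Prob[z(t(X), Y, theta) <= c | theta], Y uniform on [0, 1)."""
    total = 0.0
    for t in range(n + 1):
        lower = binom_cdf(t - 1, n, theta)
        p = binom_pmf(t, n, theta)
        # Given t(X) = t, z is uniform on [F(t-), F(t)); take the part below c.
        total += p * min(1.0, max(0.0, (c - lower) / p))
    return total

for c in (0.1, 0.5, 0.9):
    print(c, prob_z_at_most(c, n=10, theta=0.3))
```

Without the variable $y$, only the finitely many values $F(t,\theta)$ would be attainable as acceptance probabilities.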

More generally, let $\theta^*$ be any estimator, let
$G(\theta) = \mathrm{Prob}\{\theta^*(X) \le \theta \mid \theta\}$, let $G(\theta-) = \mathrm{Prob}\{\theta^*(X) < \theta \mid \theta\}$, let
$F(t,\theta) = \mathrm{Prob}\{t(X) \le t \mid \theta\}$, where $t(x)$ is a sufficient statistic
with the m.l.r. property, and as above let
$z(t(x),y,\theta) = yF(t(x),\theta) + (1-y)F(t(x)-,\theta)$. Consider the
quasistatistic $v = v(x,y,\theta) = z(t(x),y,\theta) - G(\theta)$. For each $\theta_0$,
$A(\theta_0) = \{(x,y) \mid v(x,y,\theta_0) < 0\}$ is clearly a uniformly best
acceptance region for testing $H(\theta_0)$ at level $1 - G(\theta_0) = a(\theta_0+,\theta_0,\theta^*)$.
Consider the quasistatistic $v' = v'(x,y,\theta) = z(t(x),y,\theta) - G(\theta-)$
$= v + [G(\theta) - G(\theta-)]$. For each $\theta_0$, $A(\theta_0-) = \{(x,y) \mid v'(x,y,\theta_0) < 0\}$
is clearly a uniformly best acceptance region for testing $H(\theta_0-)$;
at $\theta = \theta_0$ it has Type II error probability $G(\theta_0-) = a(\theta_0-,\theta_0,\theta^*)$.


To verify that these acceptance regions constitute a sequence
of sets which is nondecreasing in $\theta$ in the sense defined in
Section 6, we note that obviously $A(\theta_0-) \subset A(\theta_0)$, and we proceed
to prove that $\theta_1 < \theta_2$ implies $A(\theta_1) \subset A(\theta_2-)$: Assume that
$(x',y') \in A(\theta_1)$, but $(x',y') \notin A(\theta_2-)$; then
$z' = z(t(x'),y',\theta_1) < G(\theta_1)$ and $z'' = z(t(x'),y',\theta_2) \ge G(\theta_2-)$. A
best test of $H(\theta_1)$ of size $(1-z')$ (the test which rejects when
$z(t(x),y,\theta_1) > z'$) has maximum power at $\theta = \theta_2$, namely $1 - z''$; the
test with acceptance region $\{x \mid \theta^*(x) \le \theta_1\}$ has size
$1 - G(\theta_1) < (1 - z')$ and hence has power $\mathrm{Prob}\{\theta^*(X) > \theta_1 \mid \theta_2\} < 1 - z''$.
Hence $z'' < \mathrm{Prob}\{\theta^*(X) \le \theta_1 \mid \theta_2\} \le \mathrm{Prob}\{\theta^*(X) < \theta_2 \mid \theta_2\} = G(\theta_2-)$, a
contradiction which proves that $A(\theta_1) \subset A(\theta_2-)$.

For each $(x,y)$, let $\theta^{**} = \theta^{**}(x,y)$ be defined by
$\theta^{**}(x,y) = \inf\{\theta \mid \theta \in \Omega,\ (x,y) \in A(\theta)\}$. Then $\theta^{**}$ is a
non-decreasing function of $t(x)$ and of $y$, and is a uniformly best
estimator having the same location functions as the arbitrarily
given $\theta^*$. If each best test is admissible, then $\theta^{**}$ is admissible,
and hence is strictly better than $\theta^*$ or else it is equivalent to
$\theta^*$. These considerations establish the following

Lemma 2. If the family of density functions $f(x,\theta)$, $\theta \in \Omega$, admits
a sufficient statistic $t = t(x)$ having the monotone likelihood ratio
property, then an essentially complete class of estimators is
constituted by estimators of the form $\theta^* = \theta^*(t,y)$, any nondecreasing
function of $t$ and of $y$, where $y$ is an observed value of an auxiliary
randomization variable $Y$ having under each $\theta$ the same uniform
distribution on the unit interval $0 \le y < 1$, and such that $t' < t''$
implies $\theta^*(t',y') \le \theta^*(t'',y'')$ for all $y', y''$.
If $t(x)$ has a continuous c.d.f., for each $\theta$, then estimators
of this form but not depending upon $y$ constitute an essentially
complete class of estimators.

8. Score quasistatistics and generalized maximum likelihood
estimators.

For a given family $f(x,\theta)$, $\theta \in \Omega$, let $\theta_1(\theta)$, $\theta_2(\theta)$ be two
functions defined on $\Omega$, taking values in $\Omega$, and satisfying
$\theta_1(\theta) < \theta_2(\theta)$ and $\theta_1(\theta) \le \theta \le \theta_2(\theta)$ for $\theta \in \Omega$. Then for each
$\theta' \in \Omega$, a best test of $H_1$: $\theta = \theta_1(\theta')$ against $H_2$: $\theta = \theta_2(\theta')$ is one
which accepts $H_1$ when the quasistatistic

$$S(x,\theta_1(\theta),\theta_2(\theta)) = \frac{\log f(x,\theta_2(\theta)) - \log f(x,\theta_1(\theta))}{\theta_2(\theta) - \theta_1(\theta)}$$

satisfies $S(x,\theta_1(\theta'),\theta_2(\theta')) \le G(\theta',\alpha(\theta'))$, where $G(\theta',\alpha(\theta'))$ is a
constant such that $\alpha(\theta')$ is the probability, when $\theta'$ is true, that this
inequality will be satisfied. For many problems the functions
$\theta_1(\theta)$, $\theta_2(\theta)$, and $\alpha(\theta)$ can be chosen so that the generalized score
quasistatistic $v(x,\theta) = S(x,\theta_1(\theta),\theta_2(\theta)) - G(\theta,\alpha(\theta))$, $\theta \in \Omega$,
satisfies the conditions of Corollary 1 and hence defines an
admissible estimator $\theta^*(x)$ as the solution $\theta$ of the equation
$v(x,\theta) = 0$. If, for example, $\mathrm{Prob}\{v(X,\theta) = 0 \mid \theta\} = 0$ for $\theta \in \Omega$,
and the set $\{x \mid f(x,\theta) > 0\}$ is independent of $\theta \in \Omega$, then each
acceptance region $\{x \mid v(x,\theta) \le 0\}$ gives a best test which is
essentially unique (a.e. $P_\theta$, $\theta \in \Omega$), and hence admissible for
testing $H(\theta)$ and $H(\theta-)$.

Again, as $\theta_2(\theta) - \theta_1(\theta) \to 0$,

$$S(x,\theta_1(\theta),\theta_2(\theta)) \to S(x,\theta) = \frac{\partial}{\partial\theta} \log f(x,\theta),$$

if the derivative exists at each $x$, for each $\theta \in \Omega$; consider as
above the (locally-best) score quasistatistic
$v(x,\theta) = S(x,\theta) - G(\theta,\alpha(\theta))$. Again, if this $v(x,\theta)$ satisfies the
conditions of Corollary 1, then an admissible estimator $\theta^*(x)$ is
defined as the solution $\theta$ of the equation $v(x,\theta) = 0$. It is well
known that, under a mild regularity condition, an acceptance region
$\{x \mid v(x,\theta) \le 0\}$ gives a locally-best test of $H(\theta)$ and of $H(\theta-)$; under
additional mild restrictions, such as those mentioned above, these
tests are also admissible. The case $G(\theta,\alpha(\theta)) \equiv 0$, $\theta \in \Omega$,
determines (through the equation $S(x,\theta) = 0$) the maximum likelihood
estimator $\hat\theta(x)$, which is thus shown to be admissible (and to be
locally-best, i.e. to minimize $a(u,\theta)$ for $\theta$ near $u$, among all
estimators with the same location functions) provided that
$v(x,\theta) = S(x,\theta)$ satisfies the conditions mentioned. Estimators of
this form were proposed by Tukey [10] on different theoretical
grounds in connection with the methods discussed in Section 5 above.

Estimators  defined  by  use  of  the  various  score  quasistatistics 
mentioned  may  be  called  generalized  maximum  likelihood  estimators. 
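
The relation between the difference-quotient score $S(x,\theta_1(\theta),\theta_2(\theta))$ and the derivative score $S(x,\theta)$ can be illustrated numerically. The following is a minimal sketch, assuming a logistic location family, the symmetric choice $\theta_1(\theta) = \theta - d$, $\theta_2(\theta) = \theta + d$, and $G \equiv 0$; these choices, the data, and the function names are illustrative assumptions. For this family the log density is concave in $\theta$, so both scores are decreasing in $\theta$ and each defining equation has a unique root.

```python
import math

def log_f(y, theta):
    """Log density of the logistic location family (illustrative choice)."""
    z = y - theta
    return -z - 2.0 * math.log1p(math.exp(-z))

def score(x, theta, d=0.0):
    """S(x, theta - d, theta + d); d = 0 gives the derivative score S(x, theta)."""
    if d == 0.0:
        return sum(math.tanh((y - theta) / 2.0) for y in x)
    return sum(log_f(y, theta + d) - log_f(y, theta - d) for y in x) / (2.0 * d)

def solve(x, d=0.0, tol=1e-10):
    """Root theta of score(x, theta, d) = 0 by bisection (requires d <= 1)."""
    lo, hi = min(x) - 1.0, max(x) + 1.0   # score > 0 at lo and < 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(x, mid, d) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample = [0.2, 1.7, -0.4, 3.1, 0.9]   # hypothetical data
mle = solve(sample)                    # maximum likelihood estimate
gen = solve(sample, d=0.5)             # generalized maximum likelihood estimate
print(mle, gen)
```

As $d$ decreases toward zero the generalized estimate approaches the maximum likelihood estimate, in line with the limiting relation between the two scores.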

If $S(x,\theta)$ has (or may have) discontinuous distributions, it
can be replaced, as may be desired at least for some theoretical
purposes, by its continuous probability integral transform

$$\alpha(x,y,\theta) = y \cdot \mathrm{Prob}[S(X,\theta) \le S(x,\theta) \mid \theta] + (1-y) \cdot \mathrm{Prob}[S(X,\theta) < S(x,\theta) \mid \theta],$$

where $y$ is the observed value of $Y$, an auxiliary randomization
variable having, for each $\theta$, the same uniform density on $0 < y < 1$.
Then for each $\theta$, $\alpha(\theta)$ may be prescribed arbitrarily, and the
statistic

$$v(x,y,\theta,\alpha(\theta)) = \alpha(x,y,\theta) - \alpha(\theta)$$

has a continuous distribution and takes negative values with
probability $\alpha(\theta)$. In suitable problems, with suitable choices of $\alpha(\theta)$,
the quasistatistic $v$ so defined will satisfy the conditions of
Corollary 1. The same treatment can be applied to the form
$S(x,\theta_1(\theta),\theta_2(\theta))$. To avoid technicalities of little intrinsic
interest, we discuss the case in which such randomization is not
used.

If $\mathrm{Prob}\{v(X,\theta) = 0 \mid \theta\} = 0$ for each $\theta \in \Omega$, then each such
estimator has the location functions $a(\theta-,\theta) = 1 - a(\theta+,\theta) \equiv \alpha(\theta)$.
If $\alpha(\theta) \equiv \alpha$, a constant, such an estimator is a confidence limit;
if $\alpha(\theta) \equiv 1/2$, such an estimator is a median-unbiased point
estimator. In the important case that $X = (Y_1, \ldots, Y_n)$, a sample of
independent observations $Y_i$, we have $S(X,\theta) = \sum_{i=1}^{n} S(Y_i,\theta)$; the normal
approximation (based on the Central Limit Theorem)

$$a(\theta-,\theta,\hat\theta) = \mathrm{Prob}\{S(X,\theta) < 0 \mid \theta\} \approx \Phi(0) = 1/2$$

(using that $E(S(X,\theta) \mid \theta) = 0$) is often close; hence in such cases
the maximum likelihood estimator $\hat\theta(x)$ is approximately
median-unbiased. If $S(X,\theta)$ has a symmetrical distribution under $\theta$, then
clearly $\hat\theta$ is exactly median-unbiased.

In some cases, as illustrated below, a family of score
quasistatistics, e.g.

$$v(x,\theta,\alpha) = S(x,\theta) - G(\theta,\alpha), \quad 0 \le \alpha \le 1,$$

or

$$v(x,\theta,\alpha) = S(x,\theta_1(\theta),\theta_2(\theta)) - G(\theta,\alpha), \quad 0 \le \alpha \le 1,$$

can be used to determine admissible confidence curve estimators
$\theta(x,\alpha)$, $0 \le \alpha \le 1$, as solutions of equations $v(x,\theta,\alpha) = 0$.

Estimators based on score quasistatistics have direct
usefulness, which is enhanced by the simplicity of their theory and of the
practical techniques for their use. In addition they are of special
theoretical interest, due to their relations to the asymptotic
theory and techniques of maximum likelihood estimation; they
generalize and justify these techniques in an exact sense. The
following considerations lend them further intrinsic interest: For
any given problem of estimation of $\theta$, consider the class of
estimators having specified location functions $a(\theta-,\theta)$, $a(\theta+,\theta)$. For
each $\theta \in \Omega$ and each $u \ne \theta$, $u \in \bar\Omega$, let $a(u,\theta) = \min_{\theta^*} a(u,\theta,\theta^*)$, where
for $u > \theta$ the minimum is taken over all estimators such that
$a(\theta+,\theta,\theta^*) = a(\theta+,\theta)$, and for $u < \theta$ the minimum is taken over all
estimators such that $a(\theta-,\theta,\theta^*) = a(\theta-,\theta)$. Then $a(u,\theta)$ is the
envelope risk curve (i.e. the minimum of the respective ordinates of
risk curves) for the class of estimators with the given location
functions. For each $(u,\theta)$, it is possible to attain $a(u,\theta)$ in the
following sense: if $u > \theta$, the relatively trivial estimator which
takes the value $\theta$ with probability $1 - a(\theta+,\theta)$ when $\theta$ is true, and
which takes the value $u$ otherwise, and which minimizes $a(u,\theta,\theta^*)$
subject to these conditions, is equivalent to a best test between
the simple hypotheses $\theta$ and $u$, of the indicated size; such a test
can be based on the score statistic $S(x,u,\theta)$; similar remarks
apply to the case $u < \theta$. Each such single statistic $S(x,u,\theta)$ can
be embedded, as an element, in a score quasistatistic
$S(x,\theta_1(\theta),\theta_2(\theta))$ for $\theta \in \Omega$; it may or may not be possible to define
by use of this quasistatistic an estimator which has the specified
location functions. An estimator can attain $a(u,\theta)$ uniformly in
$(u,\theta)$ only in problems having the special structure described in
Section 7 above, for which uniformly best estimators exist. In
other problems, some estimators defined by generalized score
quasistatistics attain $a(u,\theta)$ at some but not all $(u,\theta)$. In all
problems, the computation of $a(u,\theta)$ requires calculations of
probabilities of events defined by score statistics $S(x,u,\theta)$; and
the possibility of its attainment by some estimator at specified
points $(u,\theta)$ is related to the existence of suitable score
quasistatistics.

8.1 Large-sample approximations.

If $x = (y_1, \ldots, y_n)$ is a sample of $n$ independent identically
distributed observations (non-identical distributions can be
discussed similarly), $S(x,\theta_1(\theta),\theta_2(\theta)) = \sum_{i=1}^{n} S(y_i,\theta_1(\theta),\theta_2(\theta))$. Let

$$\mu(u,\theta) = E[S(Y_i,\theta_1(u),\theta_2(u)) \mid \theta] \quad\text{and}\quad \sigma^2(u,\theta) = \mathrm{Var}[S(Y_i,\theta_1(u),\theta_2(u)) \mid \theta]$$

exist for each $\theta, u \in \Omega$. We allow $\theta_1(\theta) = \theta_2(\theta) = \theta$ here, taking
$S(X,\theta,\theta) = S(X,\theta)$ in this case, and assume that $\theta_1(\theta)$, $\theta_2(\theta)$ are
fixed, while $n$ may vary, in the present discussion.

In the special case $v_n(x,\theta) = \sum_{i=1}^{n} S(y_i,\theta)$, which determines
the maximum likelihood estimator $\hat\theta_n(x)$ as the solution $\theta$ of
$v_n(x,\theta) = 0$, we have by Khintchine's Theorem (even if the $\sigma(u,\theta)$'s do
not exist) that $\frac{1}{n} v_n(X,u)$ converges in probability to $\mu(u,\theta)$ when
$\theta$ is true. If $u' < \theta < u''$ implies $\mu(u',\theta) > \mu(\theta,\theta) = 0 > \mu(u'',\theta)$,
then $\lim a(u,\theta,\hat\theta_n) = 0$ for $u \ne \theta$; that is, $\hat\theta_n$ is consistent.

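Returning to the general case, for large $n$ the Central Limit
Theorem gives the normal approximation to the distributions of
$v_n(X,u,\alpha) = \sum_{i=1}^{n} S(Y_i,\theta_1(u),\theta_2(u)) - G_n(u,\alpha)$:

$$\mathrm{Prob}[v_n(X,u,\alpha) \le 0 \mid \theta] \approx \Phi\!\left(\frac{G_n(u,\alpha) - n\mu(u,\theta)}{\sqrt{n}\,\sigma(u,\theta)}\right),$$

and for $u = \theta$, the approximate determination of $G_n(\theta,\alpha)$:

$$\alpha \approx \Phi\!\left(\frac{G_n(\theta,\alpha)}{\sqrt{n}\,\sigma(\theta,\theta)}\right) \quad\text{or}\quad G_n(\theta,\alpha) \approx \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha),$$

which in the preceding formula gives

$$\mathrm{Prob}[v_n(X,u) \le 0 \mid \theta] \approx \Phi\!\left(-\sqrt{n}\,\frac{\mu(u,\theta)}{\sigma(u,\theta)} + \frac{\sigma(\theta,\theta)}{\sigma(u,\theta)}\,\Phi^{-1}(\alpha)\right).$$

For the maximum likelihood estimator, $G_n \equiv 0$, corresponding to
$\alpha = \tfrac{1}{2}$ in these formulae. Thus the risk curves of the confidence
limit estimator $\theta_n^* = \theta_n(x,\alpha)$ determined by $v_n(x,\theta,\alpha) = 0$ are
approximately

$$a(u,\theta,\theta_n(\cdot,\alpha)) \approx \begin{cases} \Phi(h(u,\theta,\alpha,n)), & u < \theta, \\ 1 - \Phi(h(u,\theta,\alpha,n)), & u > \theta, \end{cases} \qquad 0 < \alpha < 1,$$

where

$$h(u,\theta,\alpha,n) = -\sqrt{n}\,\frac{\mu(u,\theta)}{\sigma(u,\theta)} + \frac{\sigma(\theta,\theta)}{\sigma(u,\theta)}\,\Phi^{-1}(\alpha).$$

Here the sufficient (and necessary) condition for consistency of
$\theta_n(x,\alpha)$, for a fixed $\alpha$, $0 < \alpha < 1$, is again that $u' < \theta < u''$ imply
$\mu(u',\theta) > 0 > \mu(u'',\theta)$.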
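
The normal approximation to the risk curves is easy to evaluate once $\mu(u,\theta)$ and $\sigma(u,\theta)$ are known. The following is a minimal sketch, assuming the form $h(u,\theta,\alpha,n) = -\sqrt{n}\,\mu(u,\theta)/\sigma(u,\theta) + [\sigma(\theta,\theta)/\sigma(u,\theta)]\,\Phi^{-1}(\alpha)$ and the derivative score (so that $\mu(\theta,\theta) = 0$). The Poisson-mean example and all function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution

def h(u, theta, alpha, n, mu, sigma):
    """Argument of PHI in the approximate risk curve.

    mu(u, theta) and sigma(u, theta) are the mean and standard deviation,
    when theta is true, of the single-observation score evaluated at u.
    """
    return (-sqrt(n) * mu(u, theta) / sigma(u, theta)
            + sigma(theta, theta) / sigma(u, theta) * N01.inv_cdf(alpha))

def approx_risk(u, theta, alpha, n, mu, sigma):
    """Normal approximation to a(u, theta, theta_n(., alpha)), for u != theta."""
    p = N01.cdf(h(u, theta, alpha, n, mu, sigma))
    return p if u < theta else 1.0 - p

# Poisson mean: S(y, u) = y/u - 1, so mu(u, theta) = theta/u - 1 and
# sigma(u, theta) = sqrt(theta)/u.
mu_pois = lambda u, theta: theta / u - 1.0
sigma_pois = lambda u, theta: sqrt(theta) / u

for n in (10, 50, 200):
    print(n, approx_risk(3.5, 4.0, 0.5, n, mu_pois, sigma_pois))
```

The printed values decrease toward zero as $n$ grows, which is the numerical counterpart of the consistency condition on the sign of $\mu(u,\theta)$.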

The verification of the conditions of Corollary 1, for a given
$v(x,\theta)$, is sometimes difficult. Large-sample approximations are of
some theoretical and practical help in this connection. For example,
for a locally best confidence limit estimator $\theta(x,\alpha)$, where
$x = (y_1, \ldots, y_n)$ and the $Y_i$'s are independent and identically
distributed, we have as above

$$G_n(\theta,\alpha) \approx \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha),$$

and we take

$$v_n(x,\theta,\alpha) = S(x,\theta) - \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha).$$

If $S(x,\theta)$ satisfies the conditions of Corollary 1 (i.e. if for each
$x$ the maximum likelihood estimator $\hat\theta(x)$ is determined as the root $\theta$
of $S(x,\theta) = 0$), and is decreasing in $\theta$ for each $x$, then:

(A) If $\sigma(\theta,\theta)$ is constant (this is the case in some examples in the
following Section, but not in most examples), then $v_n(x,\theta,\alpha)$ is
also decreasing, as required by Corollary 1.

(B) If $\sigma(\theta,\theta)$ is decreasing or increasing at $\theta = \theta'$, then for a
fixed $x$ and $\alpha$ sufficiently near 0 or 1, either

$$-\sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha) \quad\text{or}\quad -\sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(1-\alpha) = \sqrt{n}\,\sigma(\theta,\theta)\,\Phi^{-1}(\alpha)$$

will be increasing more rapidly than $S(x,\theta)$ is decreasing at
$\theta = \theta'$, so that $v_n(x,\theta,\alpha)$ and $v_n(x,\theta,1-\alpha)$ cannot both be
decreasing in $\theta$ at $\theta'$.


(C) On the other hand, for any fixed $\alpha$, $0 < \alpha < 1$, since

$$v_n(x,\theta,\alpha) = \sum_{i=1}^{n} \left[ S(y_i,\theta) - \frac{\sigma(\theta,\theta)}{\sqrt{n}}\,\Phi^{-1}(\alpha) \right],$$

a sufficient condition for $v_n$ to be decreasing in $\theta$ is that

$$S(y_i,\theta) - \frac{\sigma(\theta,\theta)}{\sqrt{n}}\,\Phi^{-1}(\alpha)$$

be decreasing in $\theta$, for all values of $y_i$. Clearly as $n$
increases, this condition becomes a less restrictive one, being
in general satisfied for a wider range of values of $\alpha$.
8.2   Local  approximations  for  locally  best  estimators. 

In cases where there exist precise estimators, that is
estimators whose risk curves are small except for $u$ very near $\theta$, it
is natural to center attention on small neighborhoods of the possible
true values $\theta$, and to consider estimators whose risk curves are
relatively small in such neighborhoods, such as those based on score
quasistatistics with $\theta_2(\theta) - \theta_1(\theta)$ small or zero for all $\theta$. If
$\mu'(u,\theta) = \frac{\partial}{\partial u}\mu(u,\theta)$ and $\sigma'(u,\theta) = \frac{\partial}{\partial u}\sigma(u,\theta)$ exist, then
$h'(u,\theta,\alpha,n) = \frac{\partial}{\partial u}h(u,\theta,\alpha,n)$ gives the Taylor series approximation

$$h(u,\theta,\alpha,n) \doteq h(\theta,\theta,\alpha,n) + h'(\theta,\theta,\alpha,n)\,(u-\theta)$$

and a corresponding alternative form of the above approximation to

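$a(u,\theta,\theta_n(\cdot,\alpha))$. In the special case of locally-best score
quasistatistics, since $\mu(\theta,\theta) = 0$ and $\mu'(\theta,\theta) = -\sigma^2(\theta,\theta)$, we find

$$h(u,\theta,\alpha,n) \doteq -\sqrt{n}\,\sigma(\theta,\theta)\,(\theta-u) + \Phi^{-1}(\alpha)\Big[1 + \frac{\sigma'(\theta,\theta)}{\sigma(\theta,\theta)}\,(\theta-u)\Big].$$

In the first term, the coefficient of the error $(\theta - u)$ has magnitude
$\sqrt{n}\,\sigma(\theta,\theta) = \sqrt{I(\theta)}$, where $I(\theta)$ is Fisher's "Information in $X$ at $\theta$." The second
term is zero for $\alpha = \frac12$, and hence for the maximum likelihood estimator; for
other estimators, the first term dominates the second as $n$ increases.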
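The indicated approximations to risk curves are

$$a(u,\theta,\hat\theta_n) = a(u,\theta,\theta_n(\cdot,.5)) \doteq \Phi\big(-\sqrt{n}\,\sigma(\theta,\theta)\,|u-\theta|\big),$$

and for $\alpha \neq \frac12$

$$a(u,\theta,\theta_n(\cdot,\alpha)) \doteq \begin{cases} \Phi\Big(-\sqrt{n}\,\sigma(\theta,\theta)(\theta-u) + \Phi^{-1}(\alpha)\Big[\dfrac{\sigma'(\theta,\theta)}{\sigma(\theta,\theta)}(\theta-u) + 1\Big]\Big), & u < \theta, \\[2mm] 1 - \Phi(\cdots\text{same argument}\cdots), & u > \theta, \end{cases}$$

$$\doteq \text{(more roughly)}\ \ \Phi\big(-\sqrt{n}\,\sigma(\theta,\theta)\,|u-\theta|\big).$$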

These approximations exhibit the approximate normality of distribution
of these estimators for large $n$. While locally best estimators
are in general not comparable with other estimators (e.g. those above
with $\theta_1(\theta) < \theta_2(\theta)$ for all $\theta$) having similar location functions
except in problems of a simple structure, the designation
"Information" for $I(\theta)$ is clearly appropriate and useful for cases in
which so much precision is attainable that interest is practically
restricted to very small $|u - \theta|$, in which case an appropriate choice
of an estimator will usually be one which is locally best or perhaps
one defined as above with $\theta_2(\theta) - \theta_1(\theta)$ small for all $\theta$.
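
As a purely illustrative sketch, the rougher approximation $\Phi(-\sqrt{I(\theta)}\,|u-\theta|)$ to a risk curve can be evaluated numerically. The function name `approx_risk` and the argument `info_per_obs` (the information contributed by a single observation, so that $I(\theta) = n \cdot$ `info_per_obs`) are hypothetical names introduced here; the printed case assumes the unit-variance normal mean, for which the information per observation is 1.

```python
from math import sqrt
from statistics import NormalDist


def approx_risk(u, theta, n, info_per_obs):
    """Rough local approximation Phi(-sqrt(n * info_per_obs) * |u - theta|).

    This is only the "more roughly" form of the approximation; it ignores
    the correction term involving Phi^{-1}(alpha).
    """
    return NormalDist().cdf(-sqrt(n * info_per_obs) * abs(u - theta))


if __name__ == "__main__":
    # Assumed setting: normal mean with unit variance (information 1 per observation).
    for n in (10, 100, 1000):
        print(n, round(approx_risk(0.1, 0.0, n, 1.0), 4))
```

The approximation equals one half at $u = \theta$ and decreases toward zero as $n$ grows for any fixed $u \neq \theta$; as the text stresses, it carries no error bound and should be checked independently in any specific problem.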
It should be noted that the preceding approximations which
utilize a Taylor series approximation are not accompanied by bounds
on errors of approximation. Even in cases where such approximations
are very close, under a severely nonlinear transformation of the
parameter space ($\theta \to \eta = \eta(\theta)$, with $\eta(\theta)$ differentiable and increasing)
such approximations can become very inaccurate. Hence the principal
concrete value of such approximation formulae seems to be that they
provide convenient quantitative conjectures which are more or less
plausible but which require independent confirmation (or disconfirmation)
for specific problems and sample sizes. Similar remarks
apply to the preceding approximation formulae based on the Central
Limit Theorem only, with the qualification that such approximations
could be termed "less asymptotic" than those which also use the
Taylor series approximation, in the sense that the former approximations
are unaffected by monotone transformations of the parameter
space, and their use can be accompanied by use of the known bounds
on errors in the Central Limit Theorem approximation.
8.3   Remarks on asymptotic efficiency of estimators.

The theory of the asymptotic efficiency of maximum likelihood
estimators (cf. for example Cramér [13], pp. 500-504) utilizes a
criterion of asymptotic efficiency (l.c., pp. 489-490) which is restrictive
in that it applies only to estimators having asymptotically normal
distributions with means equal to the parameter estimated; such
estimators are clearly asymptotically median-unbiased (probability
of underestimation approaches $\frac12$ as $n$ increases). It is advantageous
to use a less restrictive criterion of asymptotic efficiency, one
which applies to all (sequences of) estimators which are asymptotically
median-unbiased. In order to embrace confidence limit
estimation as well as point estimation, it is advantageous to define
a criterion of asymptotic efficiency which can be applied to any
sequence of estimators whose probabilities of underestimation (at
each $\theta$) converge with increasing $n$ to a fixed constant $\alpha$, $0 < \alpha < 1$;
any such sequence may be termed an asymptotically valid sequence of
confidence limit estimators (of specified coefficient $\alpha$).

Under broad conditions (some simple ones were given above)
consistent estimators exist; it is then natural to define asymptotic
efficiency of estimators in terms of the properties of risk curves
of estimators in the neighborhood of the true value of $\theta$: an
asymptotically efficient sequence of confidence limit estimators may
be defined informally as one which is asymptotically valid and
asymptotically locally best. The estimators defined above and
illustrated in the following section, based upon quasistatistics of
the form $v_n(x_n,\theta,\alpha) = S(x_n,\theta) - G_n(\theta,\alpha)$, provide examples of such
estimators, and have the further properties of being exactly
(non-asymptotically) valid and locally-best (and typically admissible).
Additional examples are based on quasistatistics of the
form $v_n(x_n,\theta,\alpha) = S(x_n,\theta_{1,n}(\theta),\theta_{2,n}(\theta)) - G_n(\theta,\alpha)$, where as $n$ increases
$\theta_{2,n}(\theta) - \theta_{1,n}(\theta)$ decreases to zero rapidly enough to give the
asymptotically locally-best property; such estimators have the
further properties of exact validity and admissibility, and the
functions $\theta_{1,n}(\theta)$, $\theta_{2,n}(\theta)$ can be chosen so that for any finite sample size
a suitable emphasis is given to avoiding errors exceeding specified
positive magnitudes; for practical applications, such estimators
seem preferable in principle to (exactly) locally-best estimators.


The usual asymptotic theory (l.c.) is free of the important
assumption (b) of Corollary 1 above. From the present non-asymptotic
standpoint, for each $\theta$ the acceptance region
$A(\theta) = \{x \mid S(x,\theta) \le 0\}$ represents a locally-best one-sided test, and
the family of such tests can be used as usual to define a confidence
region for estimation of $\theta$, namely $U(x) = \{\theta \mid x \in A(\theta),\ \theta \in \Omega\}$; in
general such a confidence region will not have a constant confidence
coefficient, but its theory and interpretation in applications
follow usual lines. The failure of assumption (b) corresponds to
the failure of the sets $A(\theta)$ to constitute a nondecreasing sequence
in $\theta$; this in turn corresponds to the fact that, for some $x$, the
confidence region $U(x)$ will fail to constitute an interval
$[\theta^*(x), \bar\theta]$ which can be described by a lower estimator $\theta^*(x)$. The
theory of admissible confidence regions not necessarily of interval
form, and their interpretation in applications, lie outside the
scope of the present paper. However, from the present standpoint it
may be observed that the principal role of the regularity assumptions
in, for example, Cramér (l.c.) is to guarantee that with increasing
$n$, for each $\theta$ the probability that $U(x)$ will be an interval (or
equivalently that $S(x_n,\theta)$ will satisfy the assumptions of
Corollary 1) approaches unity. More precisely, with increasing $n$,
for each $\theta$ the probability of the set of points $x_n$ on which $S(x_n,u)$
is decreasing in $u$ (at least for $u$ near $\theta$) approaches unity. The
key step of the derivation from this standpoint is the observation
that the law of large numbers applies, when $\theta$ is true, to the sum

$$\frac{\partial}{\partial u}S(X_n,u) = \sum_{i=1}^{n}\frac{\partial}{\partial u}S(Y_i,u),$$

each term of which has (at least for $u$ near $\theta$) a negative expected
value $E\big[\frac{\partial^2}{\partial u^2}\log f(Y_i,u) \mid \theta\big]$. (Similar
remarks apply to use of generalized score quasistatistics which fail
to satisfy condition (b) of Corollary 1.) Dropping the qualification
"for $u$ near $\theta$" gives that the probability of multiple roots of
$S(x_n,\theta) = 0$ approaches zero with increasing $n$. Asymptotic
efficiency properties of confidence limits and intervals defined by
use of quasistatistics of the form $S(x_n,\theta) - G_n(\theta,\alpha)$ were proved
under broad regularity conditions by Wald [14].

The remarks of Lehmann [15] on the limited value of any
exclusively-asymptotic theory of optimum tests apply with equal
force to estimation theory. Asymptotically efficient estimators
may approach efficiency at arbitrarily slow rates as $n$ increases.
Only on the basis of an auxiliary non-asymptotic investigation of
the quantitative and/or qualitative (optimality) properties of an
asymptotically efficient estimator can it be recommended in an
application with a specified (finite) sample size.

9. Examples. Examples 1-3 illustrate that the formal treatment of
Section 8 can often be applied conveniently to problems admitting
uniformly best estimators.

Example 1. Normal mean. Let $x = (y_1,\ldots,y_n)$ be a sample of $n$
independent observations from a normal distribution with known
variance, say $\sigma^2 = 1$, and unknown mean $\theta$, $-\infty < \theta < \infty$. Then

$$f(x,\theta) = (2\pi)^{-n/2}\exp\Big(-\tfrac12\sum_{i=1}^{n}(y_i-\theta)^2\Big).$$

Let

$$v(x,\theta) = \frac{\partial}{\partial\theta}\log f(x,\theta) - G(\theta,\alpha(\theta)),$$

where $\alpha(\theta)$ is a given function. Then

$$v(x,\theta) = n(\bar y - \theta) - G(\theta,\alpha(\theta)) = n\bar y - n\theta - \sqrt{n}\,\Phi^{-1}(\alpha(\theta)),$$

where $\bar y = \frac1n\sum_{i=1}^{n}y_i$ and $\Phi(u)$ is the standard normal c.d.f. Then
$v(x,\theta)$ clearly satisfies the conditions of Corollary 1 if $\alpha(\theta)$ is
such that $\theta + \frac{1}{\sqrt n}\Phi^{-1}(\alpha(\theta))$ is increasing in $\theta$; as $n$ increases, the
latter condition becomes a less restrictive one on $\alpha(\theta)$; it is
obviously satisfied if $\alpha(\theta) \equiv \alpha$, $0 \le \alpha \le 1$. For each such function
$\alpha(\theta)$, an admissible estimator $\theta^*(x)$ is defined as the solution $\theta$
of $v(x,\theta) = 0$, that is, of

$$\theta + \frac{1}{\sqrt n}\,\Phi^{-1}(\alpha(\theta)) = \bar y.$$

Denoting the solution by $\theta(\bar y)$, this gives $\theta^*(x) = \theta(\bar y)$; $\theta(\bar y)$ can be
any increasing function of $\bar y$ if $\alpha(\theta)$ is suitably chosen. For
$\alpha(\theta) \equiv \alpha$, this becomes (in the general case where $\sigma$ is any positive
number)

$$\theta^*(x) = \theta(x,\alpha) = \bar y - \frac{\sigma}{\sqrt n}\,\Phi^{-1}(\alpha),$$

an upper confidence limit of confidence coefficient $1-\alpha$ (and/or a
lower confidence limit of coefficient $\alpha$). Each of these estimators
is, by Lemma 2 above, uniformly best among all estimators with the
same location functions $a(\theta-,\theta) = 1 - a(\theta+,\theta) \equiv \alpha(\theta)$. Taking
$\alpha(\theta) \equiv \frac12$ gives $\theta(x,.5) = \hat\theta(x) = \bar y$. Since this estimator is
independent of the value assumed for $\sigma^2$, the classical (maximum
likelihood and mean-unbiased) estimator $\bar y$ is uniformly best among
all median-unbiased estimators of $\theta$ even if $\sigma^2$ is not known. The
same property clearly holds for the classical least squares
estimators of linear regression theory under normality assumptions.
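
The confidence limit of Example 1 is simple enough to evaluate directly. The following is a minimal sketch under the example's assumptions (independent normal observations, known $\sigma$); the function name `normal_mean_limit` is hypothetical, and the quantile $\Phi^{-1}$ is taken from Python's standard library rather than from tables.

```python
from math import sqrt
from statistics import NormalDist, fmean


def normal_mean_limit(ys, alpha, sigma=1.0):
    """theta(x, alpha) = ybar - (sigma / sqrt(n)) * Phi^{-1}(alpha), 0 < alpha < 1.

    alpha is the probability of underestimation; alpha = 0.5 returns
    the sample mean, the median-unbiased point estimator.
    """
    n = len(ys)
    return fmean(ys) - sigma / sqrt(n) * NormalDist().inv_cdf(alpha)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 2.0]            # illustrative data only
    for alpha in (0.025, 0.5, 0.975):
        print(alpha, round(normal_mean_limit(sample, alpha), 4))
```

Small values of $\alpha$ give limits above $\bar y$ and large values give limits below it, in agreement with the reading of $\theta(x,\alpha)$ as an upper confidence limit of coefficient $1-\alpha$.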
Example 2. Normal variance. Let $x = (y_1,\ldots,y_n)$ be a sample
of $n$ independent observations from a normal distribution with known
mean, say $\mu = 0$, and unknown standard deviation $\theta = \sigma$, $0 < \sigma < \infty$.
Then

$$f(x,\theta) = (2\pi\sigma^2)^{-n/2}\exp\Big(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}y_i^2\Big).$$

Let $v(x,\theta) = \frac{\partial}{\partial\theta}\log f(x,\theta) - G(\theta,\alpha(\theta))$, where $\alpha(\theta)$ is a given
function. Then

$$v(x,\theta) = \frac{n}{\sigma}\Big(\frac{s^2}{\sigma^2} - 1\Big) - G(\sigma,\alpha(\sigma)),$$

where $s^2 = \frac1n\sum_{i=1}^{n}y_i^2$ is the usual unbiased estimator of $\sigma^2$. For a
given $\sigma$, $ns^2/\sigma^2$ has the Chi-square distribution with $n$ degrees of
freedom; hence $G(\sigma,\alpha(\sigma)) = \frac{1}{\sigma}\big(\chi^2_{n,\alpha(\sigma)} - n\big)$, where $\chi^2_{n,\alpha}$ is the lower
$\alpha$-point of the Chi-square distribution with $n$ degrees of freedom.
Thus $v(x,\theta) = \frac{1}{\sigma}\big(\frac{ns^2}{\sigma^2} - \chi^2_{n,\alpha(\sigma)}\big)$. If, for example, $\alpha(\sigma) \equiv \alpha$, then

$$\theta^*(x) = \sigma(x,\alpha) = s\sqrt{n/\chi^2_{n,\alpha}},$$

which is a uniformly best estimator, by Lemma 2 above. A uniformly
best median-unbiased estimator of $\sigma$ is $\sigma(x,.5)$. Similarly, uniformly
best estimators of the variance $\sigma^2$ are given by

$$\sigma^2(x,\alpha) = s^2\,n/\chi^2_{n,\alpha}.$$

When $n$ is not small, $n/\chi^2_{n,.5} \doteq 1$, and $\sigma(x,.5) \doteq s$ and $\sigma^2(x,.5) \doteq s^2$.
Thus the commonly used point estimators $s$ and $s^2$ can be justified
on the grounds that they are uniformly best (among estimators with
the same location functions) and very nearly (except when $n$ is
very small) median-unbiased. Tables of the Chi-square distribution
provide the constants $\chi^2_{n,.5}$, which can be used in place of $n$ in
standard procedures for computing $s$ or $s^2$, to obtain the estimates
$\sigma(x,.5)$ or $\sigma^2(x,.5)$ respectively. Comparisons of these and other
estimators from the standpoint of median-bias, with tables, were
given by Eisenhart and Martin [16]. For the more usual problem in
which $\mu$ is unknown, with $N = n+1$ observations, the same remarks
apply to the usual mean-unbiased estimator $s^2 = \sum_{i=1}^{N}(y_i - \bar y)^2/(N-1)$
and to $s$. The theory of such multi-parameter problems lies outside
the formal scope of the present paper.
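
Where tables of the Chi-square distribution are not at hand, the constants $\chi^2_{n,\alpha}$ can be computed. The sketch below is one possible way of doing so and is not taken from the paper: it evaluates the Chi-square c.d.f. by the standard power series for the incomplete gamma function and inverts it by bisection. All function names are hypothetical, and the series is intended only for moderate degrees of freedom and arguments.

```python
import math


def chi2_cdf(x, k):
    """Chi-square c.d.f. with k degrees of freedom (incomplete-gamma series)."""
    if x <= 0.0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    j = 1
    while term > 1e-16 * total:
        term *= z / (a + j)
        total += term
        j += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_quantile(alpha, k):
    """Lower alpha-point of the chi-square distribution, by bisection (0 < alpha < 1)."""
    lo, hi = 0.0, float(k) + 1.0
    while chi2_cdf(hi, k) < alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_limit(ys, alpha):
    """sigma(x, alpha) = s * sqrt(n / chi2_{n,alpha}), with s^2 = sum(y_i^2) / n.

    Assumes the known-mean case of Example 2 (mean taken to be 0).
    """
    n = len(ys)
    s2 = sum(y * y for y in ys) / n
    return math.sqrt(s2 * n / chi2_quantile(alpha, n))


if __name__ == "__main__":
    sample = [1.0, -1.0, 1.0, -1.0]          # illustrative data only
    print(round(sigma_limit(sample, 0.5), 4))  # median-unbiased estimate of sigma
```

For the illustrative sample $s = 1$, and the median-unbiased estimate exceeds $s$ slightly because the median of the Chi-square distribution lies below its mean $n$.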

Example 3. Binomial mean. Let $x = (y_1,\ldots,y_n)$, where the $Y_i$'s
are independent, Prob$(Y_i = 1) = \theta$, Prob$(Y_i = 0) = 1 - \theta$,
$0 \le \theta \le 1$. Let $Z$ be an auxiliary randomization variable, uniformly
distributed on $0 < Z < 1$. Then $t = t(x,z) = n\bar y + z$, where $n\bar y = \sum_{i=1}^{n}y_i$,
is a sufficient statistic having the monotone likelihood ratio
property; hence each nondecreasing function $\theta^*(t)$ taking values in
the unit interval is a uniformly best estimator. The classical
(maximum likelihood, unbiased) estimator is $\hat\theta = [t]/n = \bar y$, where $[t]$
is the largest integer not exceeding $t$. By use of binomial tables,
exact confidence limits $\theta(t,\alpha)$ and median-unbiased estimators
$\theta(t,.5)$ can be determined easily as the solutions $\theta$ of the
equations $\alpha = $ Prob$(T \le t \mid \theta)$, where $t$ is the observed value of
the statistic. For typical purposes of informative inference, it
seems preferable to dispense with use of the randomization
variable $z$. A non-randomized uniformly best point estimator having
location functions closest to $\frac12$, in a certain sense, is defined
for each observed value of $n\bar y$ as the solution $\theta$ of the equation
Prob$(\bar Y < \bar y \mid \theta) = $ Prob$(\bar Y > \bar y \mid \theta)$; this estimator $\tilde\theta(\bar y)$ is easily
determined by use of binomial tables; when $n$ is not small, we have
$\tilde\theta(\bar y) \doteq \bar y$. In all cases the effect of the randomization variable
is minor except when $n$ is small. Thus the classical mean-unbiased
estimator can be justified on the grounds that it is uniformly
best (among estimators with the same location functions) and is
very nearly (except when $n\theta$ or $n(1-\theta)$ is very small) median-unbiased.

Other discrete examples with the m.l.r. property, such as the
Poisson and negative binomial, may be treated similarly.
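
The non-randomized estimator of Example 3 can be computed without binomial tables. The sketch below solves Prob$(\bar Y < \bar y \mid \theta) = $ Prob$(\bar Y > \bar y \mid \theta)$ by bisection, writing the equation in terms of the count $k = n\bar y$. The function names and the handling of the boundary counts $k = 0$ and $k = n$ (returned as 0 and 1) are assumptions made for illustration.

```python
from math import comb


def binom_pmf(j, n, theta):
    """Binomial probability of j successes in n trials."""
    return comb(n, j) * theta**j * (1.0 - theta) ** (n - j)


def balanced_estimate(k, n, tol=1e-12):
    """Solve Prob(Y < k | theta) = Prob(Y > k | theta), Y ~ Binomial(n, theta)."""
    if k == 0:
        return 0.0
    if k == n:
        return 1.0

    def gap(theta):
        below = sum(binom_pmf(j, n, theta) for j in range(k))
        above = sum(binom_pmf(j, n, theta) for j in range(k + 1, n + 1))
        return below - above

    # gap is +1 at theta = 0, -1 at theta = 1, and decreasing in between.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n = 10
    for k in range(n + 1):
        print(k, k / n, round(balanced_estimate(k, n), 4))
```

Printing the estimate next to $\bar y = k/n$ shows how closely the two agree for interior counts, which is the sense of the remark that $\tilde\theta(\bar y) \doteq \bar y$ when $n$ is not small.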

Example 4. Logistic mean. Let $x = (y_1,\ldots,y_n)$ be a sample of
$n$ independent observations from a logistic distribution with unknown
mean $\theta$: Prob$(Y \le y \mid \theta) = \Psi(y-\theta) = (1 + e^{-(y-\theta)})^{-1}$, $-\infty < y < \infty$,
$-\infty < \theta < \infty$; $Y$ has the density function

$$\psi(y-\theta) = e^{-(y-\theta)}/(1 + e^{-(y-\theta)})^2, \qquad -\infty < y < \infty.$$

For any fixed $\Delta > 0$, taking $\theta_1(\theta) = \theta - \Delta$, $\theta_2(\theta) = \theta + \Delta$,
determines a score quasistatistic

$$S(x,\theta-\Delta,\theta+\Delta) = \sum_{i=1}^{n}\big(\log\psi(y_i-\theta-\Delta) - \log\psi(y_i-\theta+\Delta)\big).$$

For any fixed $\alpha$, $0 \le \alpha \le 1$, taking $\alpha(\theta) \equiv \alpha$ determines a score
quasistatistic

$$v(x,\theta,\alpha) = S(x,\theta-\Delta,\theta+\Delta) - G(\theta,\alpha)$$

which satisfies the conditions of Corollary 1 of Section 6 above,
and hence determines an admissible confidence limit estimator
$\theta^* = \theta(x,\alpha)$ as the solution $\theta$ of the equation $v(x,\theta,\alpha) = 0$. Since
$\theta$ is a translation parameter, $G(\theta,\alpha)$ is independent of $\theta$, and may be
written $G(\alpha)$. By symmetry, $G(.5) = 0$. $G(\alpha)$ can be determined
approximately, except for $\alpha$ very near 0 or 1 and for very small $n$,
by use of the Central Limit Theorem: let $\mu(u,\theta)$ and $\sigma^2(u,\theta)$ denote
respectively the mean and variance of $S(Y,u-\Delta,u+\Delta)$ when $\theta$ is
true; then $\mu(\theta,\theta) = 0$ by symmetry; we may write $\mu(u-\theta)$ and
$\sigma^2(u-\theta)$ because $\theta$ is a translation parameter. We have

$$\text{Prob}\,[v(X,u,\alpha) \le 0 \mid \theta] \doteq \Phi\Big(\frac{G(\alpha) - n\mu(u-\theta)}{\sqrt{n}\,\sigma(u-\theta)}\Big),$$

which provides an approximation to the risk curves $a(u,\theta,\theta^*)$ of the
estimator $\theta^* = \theta(x,\alpha)$; for the determination of $G(\alpha)$, similarly

$$\text{Prob}\,[v(X,\theta,\alpha) \le 0 \mid \theta] = \alpha \doteq \Phi\big(G(\alpha)/\sqrt{n}\,\sigma(0)\big), \quad\text{or}\quad G(\alpha) \doteq \sqrt{n}\,\sigma(0)\,\Phi^{-1}(\alpha).$$

This, with the formula above, gives the approximate risk curves of $\theta^*$:

$$a(u,\theta,\theta^*) \doteq \begin{cases} \Phi\Big(\dfrac{\sqrt{n}\,\sigma(0)\,\Phi^{-1}(\alpha) - n\mu(u-\theta)}{\sqrt{n}\,\sigma(u-\theta)}\Big) & \text{for } u < \theta, \\[2mm] 1 - \Phi(\cdots\text{same argument}\cdots) & \text{for } u > \theta. \end{cases}$$

The preceding discussion depended throughout on the chosen
value $\Delta > 0$. A locally best confidence limit estimator
$\theta^* = \theta(x,\alpha)$ is determined as the solution $\theta$ of the equation

$$v(x,\theta,\alpha) = S(x,\theta) - G(\alpha) = 0.$$

Here $S(y,\theta) = \frac{\partial}{\partial\theta}\log\psi(y-\theta) = 2\Psi(y-\theta) - 1$; $\Psi(Y-\theta)$ has, when $\theta$ is
true, a uniform distribution on the unit interval; hence when $\theta$
is true the c.d.f. of $\sum_{i=1}^{n}\Psi(Y_i-\theta)$ (and hence that of $S(X,\theta)$) can
be calculated as in Cramér [13], pp. 244-246. The normal
approximation gives (since
$\sigma^2(0) = \text{Var}[S(Y,\theta) \mid \theta] = \frac13$, $\text{Var}[S(X,\theta) \mid \theta] = \frac n3$), $G(\alpha) \doteq \sqrt{n/3}\,\Phi^{-1}(\alpha)$;
$\alpha = \frac12$ gives exactly $G(\frac12) = 0$ and determines the maximum likelihood
estimator $\hat\theta = \theta(x,.5)$. In general, a locally best confidence limit
estimator $\theta(x,\alpha)$ is determined (approximately, except for $\alpha = \frac12$) as
the root $\theta$ of the equation $S(x,\theta) = \sqrt{n/3}\,\Phi^{-1}(\alpha)$, or

$$\sum_{i=1}^{n}\Psi(y_i-\theta) = \frac n2 + \frac12\sqrt{\frac n3}\,\Phi^{-1}(\alpha).$$

Such an equation is easily solved numerically by use of Berkson's
tables of $\Psi(u)$ ([17]).
The present example serves also to illustrate the determination
of an admissible confidence curve estimator by use of a
family of quasistatistics as described at the end of Section 6
above. Each of the families of quasistatistics $v(x,\theta,\alpha)$, $0 \le \alpha \le 1$,
considered here (each based upon a fixed $\Delta \ge 0$) has the property
that $\theta(x,\alpha)$ is, for each fixed $x$, decreasing in $\alpha$; in fact, for
each $x$, $\theta(x,\alpha)$ decreases continuously from $\infty$ to $-\infty$ as $\alpha$
increases from 0 to 1. Thus for each observed $x$, each $\theta$
($-\infty \le \theta \le \infty$) will be a confidence limit $\theta(x,\alpha)$ for some $\alpha$; we
can conveniently determine the required solutions $\theta(x,\alpha)$ of
$v(x,\theta,\alpha) = 0$ in the form

$$\alpha(x,\theta) = \text{Prob}\,[S(X,\theta) \le S(x,\theta) \mid \theta] \doteq \Phi\big(\sqrt{3/n}\;S(x,\theta)\big)$$

for as many values of $\theta$ as desired.

Numerical example. Let $x = (y_1,y_2,y_3) = (0,0,6)$. Letting $\theta_i$
denote a trial value of $\theta$, $S_i = S(x,\theta_i)$, and $\alpha_i = \alpha(x,\theta_i) =$
Prob$\,[S(X,\theta_i) \le S(x,\theta_i) \mid \theta_i]$, $i = 1,2,\ldots$, and taking $\theta_1 = \bar y = 2$
as a trial value plausibly near $\theta(x,.5) = \hat\theta$, we obtain

$$S_1 = 2\sum_{i=1}^{3}\Psi(y_i - 2) - 3 = -0.559, \qquad \alpha_1 \doteq \Phi(-.559) = .288.$$

Further similar computations are summarized in Table 1 and in Fig. 1,
a sketch of the confidence curve $c(\theta,x) = \min[\alpha(x,\theta),\ 1-\alpha(x,\theta)]$.
Table 1

   i     $\theta_i$     $S_i$       approx. $\alpha_i$    exact $\alpha_i$
   1     2.0     -0.559      .288
   2     1.44    -0.256      .399
   3     1.18    -0.0758     .470
   4     1.12    -0.031      .488
   5     1.08    -0.0005     .4998         .4998
   6     3.08    -0.927      .177
   7     4.0     -1.166      .122
   8     5.0     -1.511      .065
   9     6.0     -2.0        .023
  10     7.0     -2.462      .007
  11    -1.0      1.924      .973
  12    -2.0      2.523      .994          .998
  13     0.0      1.0        .841          .833

Figure 1
The closeness of the normal approximations can be checked in
the present case by use of the exact formula (based on Cramér, l.c.)

$$\alpha(x,\theta) = \begin{cases} \frac16 z^3, & 0 \le z \le 1, \\[1mm] \frac16 z^3 - \frac12 (z-1)^3, & 1 \le z \le 2, \\[1mm] 1 - \frac16 (3-z)^3, & 2 \le z \le 3, \end{cases}$$

where $z = z(x,\theta) = \frac12(S(x,\theta) + 3)$. The approximation is seen to be
quite adequate here. In other examples, if exact values of
$\alpha(x,\theta)$ cannot be obtained by use of standard tables or tractable
integrals, one may consider checking approximate values of
$\alpha(x,\theta)$, for a few values of $\theta$ of particular interest, by use of
(a) the error-bound on the normal approximation, (b) numerical
integration, (c) empirical sampling (Monte Carlo), or possibly
(d) an asymptotic expansion. For (a) and (d), see Wallace [18].
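
The entries of Table 1 can be recomputed mechanically. The sketch below evaluates the locally best score $S(x,\theta) = \sum_i [2\Psi(y_i-\theta) - 1]$, the normal approximation $\Phi(\sqrt{3/n}\,S(x,\theta))$, and, for $n = 3$ only, the exact value from the c.d.f. of a sum of three independent uniform variables with $z = \frac12(S + 3)$. The function names are hypothetical; the data are those of the numerical example.

```python
from math import exp, sqrt
from statistics import NormalDist


def logistic_cdf(u):
    """Logistic c.d.f. Psi(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + exp(-u))


def score(ys, theta):
    """Locally best score S(x, theta) = sum_i [2 * Psi(y_i - theta) - 1]."""
    return sum(2.0 * logistic_cdf(y - theta) - 1.0 for y in ys)


def alpha_normal(ys, theta):
    """Normal approximation Phi(sqrt(3/n) * S(x, theta))."""
    return NormalDist().cdf(sqrt(3.0 / len(ys)) * score(ys, theta))


def alpha_exact_n3(ys, theta):
    """Exact value for n = 3, from the c.d.f. of a sum of three uniforms."""
    z = 0.5 * (score(ys, theta) + 3.0)
    if z <= 1.0:
        return z**3 / 6.0
    if z <= 2.0:
        return z**3 / 6.0 - (z - 1.0) ** 3 / 2.0
    return 1.0 - (3.0 - z) ** 3 / 6.0


if __name__ == "__main__":
    x = (0.0, 0.0, 6.0)
    for theta in (2.0, 1.08, 0.0, -2.0):
        print(theta, round(score(x, theta), 3),
              round(alpha_normal(x, theta), 3),
              round(alpha_exact_n3(x, theta), 3))
```

Evaluating both functions over a grid of $\theta$ values gives as many points of the confidence curve as desired, together with a direct check of the normal approximation at each point.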

The values $\theta_i$ above, for $i = 2,\ldots,5$, were determined by
$\theta_{i+1} = \theta_i + S_i$, based on Fisher's formula
$\theta_{i+1} = \theta_i + S(x,\theta_i)/\text{Var}\,[S(X,\theta_i) \mid \theta_i]$ for iterative calculation of
maximum likelihood estimates. If $\log f(x,\theta) = a\theta^2 + b\theta + c$ for
some constants $a < 0$, $b$, $c$, at least for $\theta$ near $\hat\theta(x)$ (asymptotic
theory shows that this will be the case with high probability for
sufficiently large $n$, under certain regularity conditions), then
$S(x,\theta) = 2a\theta + b$, $\frac{\partial}{\partial\theta}S(x,\theta) = 2a$; $(a\theta^2 + b\theta + c)$ is maximized by
$\theta^* = -b/2a = \theta - S(x,\theta)\big/\frac{\partial}{\partial\theta}S(x,\theta)$. $\frac{\partial}{\partial\theta}S(x,\theta)$ may be calculated
directly; or approximated numerically from difference quotients
$\Delta S(x,\theta)/\Delta\theta$ based on previously calculated values $S_i$; or (as done above)
"estimated" by its expected value: for sufficiently large $n$, with
high probability the approximation

$$\frac{\partial}{\partial\theta}S(x,\theta) \doteq E\Big[\frac{\partial}{\partial\theta}S(X,\theta) \,\Big|\, \theta\Big] = E\Big[\frac{\partial^2}{\partial\theta^2}\log f(X,\theta) \,\Big|\, \theta\Big] = -\text{Var}\,[S(X,\theta) \mid \theta] = -I(\theta)$$

is effectively close. The rate of convergence of $\theta_i$ to $\hat\theta$ may be
slow as above, for samples with "improbable configurations" and/or
small $n$; use of $\frac{\partial}{\partial\theta}S(x,\theta)$ rather than its expected value here would
evidently give faster convergence, but would require additional
calculations for each $i$. Speed of convergence is not of exclusive
interest here, since a number of values of $\alpha_i = \alpha(x,\theta_i)$ are
desired for a sketch of the confidence curve estimate; any convenient
method of choosing successive $\theta_i$'s may be used.
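
The iteration can be sketched for the logistic example as follows, replacing $-\frac{\partial}{\partial\theta}S$ by its expected value $\text{Var}[S(X,\theta) \mid \theta] = n/3$. The function names, the stopping tolerance, and the iteration cap are assumptions chosen for illustration.

```python
from math import exp


def logistic_cdf(u):
    """Logistic c.d.f. Psi(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + exp(-u))


def score(ys, theta):
    """Score S(x, theta) = sum_i [2 * Psi(y_i - theta) - 1] for the logistic mean."""
    return sum(2.0 * logistic_cdf(y - theta) - 1.0 for y in ys)


def fisher_scoring(ys, theta0, tol=1e-8, max_iter=200):
    """Iterate theta <- theta + S(x, theta) / Var[S(X, theta)], with Var = n / 3.

    Returns the final value and the list of all iterates.
    """
    info = len(ys) / 3.0
    theta = theta0
    path = [theta]
    for _ in range(max_iter):
        step = score(ys, theta) / info
        theta += step
        path.append(theta)
        if abs(step) < tol:
            break
    return theta, path


if __name__ == "__main__":
    theta_hat, path = fisher_scoring((0.0, 0.0, 6.0), 2.0)
    print([round(t, 3) for t in path[:5]], round(theta_hat, 4))
```

With $x = (0,0,6)$ and the starting value $\bar y = 2$, the first step gives about 1.44, as in Table 1; later iterates may differ slightly from the tabulated trial values, which were rounded by hand.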

The values $\theta_6$ and $\theta_{11}$ above were chosen as trial approximations
to the confidence limits $\theta(x,.025)$, $\theta(x,.975)$ respectively,
by use of the asymptotic formula for such confidence limits:

$$\hat\theta \pm \Phi^{-1}(.975)\big/\sqrt{\text{Var}\,[S(X,\theta) \mid \theta]} \doteq \hat\theta \pm 2.$$

The poor approximations obtained provide a limited illustration
of the fact that such approximations are "more asymptotic," i.e.
may be expected to be often less close, than the normal
approximations to distributions of score statistics.

Example 5. Laplacean mean. Let $x = (y_1,\ldots,y_n)$ be a sample
of $n$ independent observations from a Laplacean (double exponential)
distribution with unknown mean $\theta$, $-\infty < \theta < \infty$, with density
function

$$h(y,\theta) = \tfrac12 e^{-|y-\theta|}, \qquad -\infty < y < \infty.$$

For any fixed $\Delta > 0$, let

$$v(x,\theta,\alpha) = S(x,\theta-\Delta,\theta+\Delta) - G(\theta,\alpha) = -\sum_{i=1}^{n}\big(|y_i-\theta-\Delta| - |y_i-\theta+\Delta|\big) - G(\alpha).$$

We note that

$$-\big(|y-\theta-\Delta| - |y-\theta+\Delta|\big) = \begin{cases} 2\Delta & \text{if } \theta < y - \Delta, \\ 2(y-\theta) & \text{if } y - \Delta \le \theta \le y + \Delta, \\ -2\Delta & \text{if } y + \Delta \le \theta, \end{cases}$$

and hence

$$-2\Delta n \le -\sum_{i=1}^{n}\big(|y_i-\theta-\Delta| - |y_i-\theta+\Delta|\big) \le 2\Delta n \quad\text{for all } x.$$

Since Prob$\{Y \le \theta - \Delta \mid \theta\} = \frac12 e^{-\Delta}$, the c.d.f. of
$-\sum_{i=1}^{n}\big(|Y_i-\theta-\Delta| - |Y_i-\theta+\Delta|\big)$ has a jump of $(\frac12 e^{-\Delta})^n$ at each
end of its range, and is continuously increasing between these
jumps. Hence $G(\alpha)$ is well-defined if $(\frac12 e^{-\Delta})^n < \alpha < 1 - (\frac12 e^{-\Delta})^n$;
for other $\alpha$'s use of an auxiliary randomization variable would be
necessary; by symmetry, $G(\frac12) = 0$. A simple computation gives

$$\text{Var}\,\big(|Y-\theta-\Delta| - |Y-\theta+\Delta|\big) = 8\,(1 - e^{-\Delta} - \Delta e^{-\Delta}) = v,$$

say; for $n$ not very small and $\alpha$ not extreme, the normal approximation
to the distribution of $v(X,\theta,\alpha)$ gives

$$G(\alpha) \doteq \sqrt{nv}\;\Phi^{-1}(\alpha).$$

For any $\alpha$ bounded as above, by Corollary 1 the estimator $\theta(x,\alpha)$,
defined as the solution $\theta$ of $v(x,\theta,\alpha) = 0$, is admissible.

The median-unbiased estimator $\theta^*(x) = \theta(x,.5)$ defined as the
solution $\theta$ of

$$\sum_{i=1}^{n}|(y_i - \Delta) - \theta| = \sum_{i=1}^{n}|(y_i + \Delta) - \theta|$$

(which is easily solved numerically), depends upon the particular
value $\Delta$ chosen; the error-probabilities $a(\theta-\Delta,\theta,\theta^*)$,
$a(\theta+\Delta,\theta,\theta^*)$ have a minimized common value for all $\theta$.

Locally-best estimators ("$\Delta \to 0$") $\theta(x,\alpha)$ are defined by
use of

$$v(x,\theta,\alpha) = \sum_{i=1}^{n}I(y_i > \theta) - \sum_{i=1}^{n}I(y_i < \theta) - G(\alpha),$$

where, for any relation $R$, the indicator-function $I(R)$ is defined
by $I(R) = 1$ if $R$ is true and $I(R) = 0$ if $R$ is false. Thus
$\sum_{i=1}^{n}I(y_i > \theta) - \sum_{i=1}^{n}I(y_i < \theta)$ is the number of observations $y_i$
exceeding $\theta$ minus the number of observations less than $\theta$; with
probability one, the observations $y_i$ have $n$ distinct values, and
may be ordered, $y_{(1)} < y_{(2)} < \cdots < y_{(n)}$. Then

$$\sum_{i=1}^{n}I(y_i > \theta) - \sum_{i=1}^{n}I(y_i < \theta) = \begin{cases} n, & \text{if } \theta < y_{(1)}, \\ n-1, & \text{if } \theta = y_{(1)}, \\ n-2, & \text{if } y_{(1)} < \theta < y_{(2)}, \\ \ \vdots \\ -n+1, & \text{if } \theta = y_{(n)}, \text{ and} \\ -n, & \text{if } \theta > y_{(n)}. \end{cases}$$

Let $r$ be any integer, $1 \le r \le n$. It is easily seen that for

$$\alpha = 1 - \frac{1}{2^n}\sum_{u=0}^{r-1}\binom{n}{u}, \qquad G(\alpha) = n + 1 - 2r;$$

hence

$$v(x,\theta,\alpha) = \sum_{i=1}^{n}I(y_i > \theta) - \sum_{i=1}^{n}I(y_i < \theta) - (n + 1 - 2r).$$

With probability one, $v(x,\theta,\alpha) = 0$ will have a unique solution,
namely $\theta(x,\alpha) = y_{(r)}$. Since $G(0) = -n$ and $G(1) = n$, $\theta(x,1) \equiv -\infty$
and $\theta(x,0) \equiv \infty$. For any observed $x$, the set of $(n+2)$ confidence
limits

$$[\theta(x,1),\ \theta(x,1-(\tfrac12)^n),\ \ldots,\ \theta(x,(\tfrac12)^n),\ \theta(x,0)] \equiv [-\infty,\ y_{(1)},\ y_{(2)},\ \ldots,\ y_{(n)},\ \infty]$$

serves as a (locally-best) confidence curve estimate. (For other
values of $\alpha$, use of an auxiliary randomization variable would be
required in defining $v(x,\theta,\alpha)$.) In contrast to the approximate
confidence limits given by asymptotic methods, the various exact
confidence limits here depend on all values $y_i$ in the sample $x$ and
not only on the value of $\hat\theta = y_{((n+1)/2)}$, the sample median (for $n$
odd).
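
The coefficients attached to the order statistics can be tabulated directly from $\alpha_r = 1 - 2^{-n}\sum_{u=0}^{r-1}\binom{n}{u}$. The sketch below does so and pairs each $y_{(r)}$ with its coefficient; the function names are hypothetical, and the sample is assumed to have distinct values, as holds with probability one.

```python
from math import comb


def order_statistic_levels(n):
    """alpha_r = 1 - 2^{-n} * sum_{u=0}^{r-1} C(n, u), for r = 1, ..., n."""
    levels = {}
    cum = 0
    for r in range(1, n + 1):
        cum += comb(n, r - 1)          # running sum of C(n, 0), ..., C(n, r-1)
        levels[r] = 1.0 - cum / 2**n
    return levels


def confidence_curve_points(ys):
    """Pairs (y_(r), alpha_r): the locally best limits theta(x, alpha_r) = y_(r)."""
    levels = order_statistic_levels(len(ys))
    return [(y, levels[r]) for r, y in enumerate(sorted(ys), start=1)]


if __name__ == "__main__":
    for y, alpha in confidence_curve_points([6.0, 0.0, 2.0]):
        print(y, alpha)
```

For $n = 3$ the coefficients are .875, .5 and .125, so the middle order statistic is the median-unbiased estimate and the extreme observations serve as confidence limits of coefficients $1 - (\frac12)^3$ and $(\frac12)^3$.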


For the more general problem of estimating the median $\theta$ of a
Laplacean density function

$$h(y,\theta,c) = \frac{1}{2c}\,e^{-|y-\theta|/c}, \qquad -\infty < y < \infty,$$

with known scale parameter $c > 0$, similar derivations give the
same locally best confidence limits and confidence curve estimators.
Since these estimators are independent of $c$, they can be
used for estimation of $\theta$ in the more general problem in which $c$
is unknown. For the latter problem, they remain valid and locally
best (with respect to errors in estimation of $\theta$, uniformly in $c$),
and their risk curves respectively depend on the argument $(u-\theta)/c$.

Still more generally, let the $Y_i$'s be independent with any
continuous c.d.f. of unknown form, with unknown median $\theta$. Since
the estimators of $\theta$ given above remain valid (have the given
location functions), and are essentially unique locally-best
estimators with the given location functions in the special case
of Laplacean distributions, these estimators may be called
admissible for the non-parametric problem of estimation of a median
of a (continuous) distribution of unknown form. Similar remarks
apply to such use of order statistics $y_{(r)}$ as estimators of the
$p$-quantile of a continuous distribution of unknown form; here the
generalized Laplacean density function


5^ 

for which $\theta$ is the $p$-quantile, replaces the Laplacean density,
for any specified $p$, $0 < p < 1$, and the derivation proceeds in
essentially the same way as above where $p = \frac12$.

Example   6«      0,uantal   response   models  a      Let  x  =   {j-,f»j)f 
where   the  Y. 's  are   Independent, 

Prob    ^Y^  =  IjO?     =   P^(©),   ProbJy^  =  oIqJ    =Q^(©)   =  1   -   Pj^(O), 

1     ~    l,»eon. 


where the P_i(θ)'s are known increasing functions of θ, having
derivatives P_i'(θ), θ ∈ Ω = (θ̲, θ̄), an open interval. Examples
include:

(1) Dilution series [19]: P_i(θ) = 1 − e^{−d_i θ}, where d_i is a
known "dose" (volume) of material examined in the i-th observation,
and θ is the unknown mean concentration of minute particles per unit
volume randomly distributed in the material.

(2) Mental ability tests, normal model [20]:
P_i(θ) = (1/k_i) + ((k_i−1)/k_i) Φ(a_i + b_i θ) is the probability
that a subject with unknown ability-parameter θ will respond
correctly to the i-th item in a test. Here Φ is the standard normal
c.d.f., and the parameters 0 < k_i ≤ ∞, −∞ < a_i < ∞, and b_i > 0
which characterize the i-th item may be assumed known (or estimated
with high precision) on the basis of previous investigation; a_i
represents the item's level of difficulty, b_i its sensitivity, and
(1/k_i) if positive may be interpreted as the probability of a
correct response due to guessing only.

(3) Mental ability test, logistic model [21]: As in (2), with Φ(u)
replaced by the logistic c.d.f. Ψ((1.7)u) = 1/(1 + e^{−1.7u}). This
very slight quantitative modification gives a model which is equally
plausible and has much greater mathematical tractability; in the
case where 1/k_i = 0, it provides a sufficient statistic with the
monotone likelihood ratio property.

(4) One-parameter bioassay model, normal form [22]:
P_i(θ) = (1/k) + ((k−1)/k) Φ(θ + b d_i). Here θ is the unknown
concentration of a component in material being assayed; the case
1/k = 0 is most common; d_i is a known dose parameter; b is a
sensitivity parameter which in special cases may be known or
estimated with relatively high precision.

(5) One-parameter bioassay model, logistic form [23]: As in (4),
with Φ replaced by Ψ. In the usual case 1/k = 0, with b known this
model provides a sufficient statistic with the monotone likelihood
ratio property.
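We have

\[
S(y_i,\theta) =
\begin{cases}
P_i'(\theta)/P_i(\theta), & \text{for } y_i = 1, \\
Q_i'(\theta)/Q_i(\theta) = -P_i'(\theta)/(1-P_i(\theta)), & \text{for } y_i = 0,
\end{cases}
\]

or

\[
S(y_i,\theta) = -P_i'(\theta)/Q_i(\theta) + y_i\,P_i'(\theta)/\bigl(P_i(\theta)Q_i(\theta)\bigr),
\qquad y_i = 0 \text{ or } 1,
\]

and

\[
\mu_i(u,\theta) = E[S(Y_i,u)\mid\theta]
= P_i'(u)\left[\frac{P_i(\theta)}{P_i(u)} - \frac{Q_i(\theta)}{Q_i(u)}\right],
\]
\[
\sigma_i^2(u,\theta) = \mathrm{Var}[S(Y_i,u)\mid\theta]
= P_i'(u)^2\,P_i(\theta)Q_i(\theta)\big/\bigl(P_i(u)^2 Q_i(u)^2\bigr),
\]
\[
\mu_i(\theta,\theta) = 0, \qquad
\sigma_i^2(\theta,\theta) = P_i'(\theta)^2\big/\bigl(P_i(\theta)Q_i(\theta)\bigr).
\]

The normal approximation gives

\[
\mathrm{Prob}[S(X,u) \le k \mid \theta] \approx
\Phi\!\left(\frac{k - \sum_i \mu_i(u,\theta)}{\bigl(\sum_i \sigma_i^2(u,\theta)\bigr)^{1/2}}\right).
\]

For a given (u,θ), this approximation is close provided that (a) the
right member is not very near 0 nor 1, and (b) the number m of
σ_i²(u,θ)'s near max_i σ_i²(u,θ) in value is not small.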

If for each i and y_i, S(y_i,θ) is decreasing in θ (i.e.,
P_i(θ)P_i''(θ) < P_i'(θ)² and −Q_i(θ)P_i''(θ) < P_i'(θ)²), then
v(x,θ) = S(x,θ) satisfies the conditions of Corollary 1, and the
maximum likelihood estimator θ̂(x), the solution of S(x,θ) = 0, is
admissible; if the normal approximation above (with u = θ) is close
for respective values of θ, θ̂ is approximately median-unbiased; if
the approximation is close for respective values of (u,θ), θ̂ has
the approximate risk curves

\[
a(u,\theta,\hat\theta) =
\begin{cases}
\Phi\!\left[-\sum_i \mu_i(u,\theta)\big/\bigl(\sum_i \sigma_i^2(u,\theta)\bigr)^{1/2}\right], & u < \theta, \\
1 - \Phi[\,\cdots\ \text{same argument}\ \cdots\,], & u > \theta.
\end{cases}
\]
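As an illustration of solving S(x,θ) = 0 numerically, the following minimal sketch treats the dilution series model (1), for which the score terms simplify to d_i/(e^{d_i θ} − 1) when y_i = 1 and to −d_i when y_i = 0, so that S(x,θ) is decreasing in θ whenever at least one y_i = 1. The function names, the search interval, and the doses and responses are hypothetical and chosen only for the example; bisection is one convenient root-finder and is not taken from the text.

```python
import math

def score(theta, doses, ys):
    """Score S(x, theta) for the dilution-series model P_i = 1 - exp(-d_i * theta)."""
    total = 0.0
    for d, y in zip(doses, ys):
        if y == 1:
            total += d / math.expm1(d * theta)   # P_i'/P_i = d / (e^{d*theta} - 1)
        else:
            total -= d                           # -P_i'/Q_i = -d
    return total

def mle_by_bisection(doses, ys, lo=1e-9, hi=100.0, tol=1e-10):
    """Root of S(x, theta) = 0 on (lo, hi); relies on S being decreasing in theta."""
    if all(y == 0 for y in ys) or all(y == 1 for y in ys):
        raise ValueError("no finite positive root for an all-0 or all-1 sample")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, doses, ys) > 0:
            lo = mid          # score still positive: root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

doses = [1.0, 1.0, 0.5, 0.5, 0.25, 0.25]   # hypothetical volumes d_i
ys = [1, 1, 1, 0, 0, 0]                    # hypothetical responses y_i
theta_hat = mle_by_bisection(doses, ys)
```

For these illustrative data the score equation reduces to a quadratic in e^{θ/2}, which gives an independent check on the numerical root.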


More generally, to determine locally best (approximate)
confidence limits θ*(x,α) as solutions θ of

\[
v(x,\theta,\alpha) \equiv S(x,\theta) - \Bigl(\sum_i \sigma_i^2(\theta,\theta)\Bigr)^{1/2}\,\Phi^{-1}(\alpha) = 0,
\]

a simple adaptation of the discussion at the end of Section 8.1
above may be applied to the problem of verification of the
conditions of Corollary 1.

Example 7. Rectangular mean. Let x = (y_1,...,y_n) be a sample
of n independent observations on a random variable Y with density

\[
h(y,\theta) =
\begin{cases}
1, & \text{if } \theta - \tfrac12 \le y \le \theta + \tfrac12, \\
0, & \text{otherwise,}
\end{cases}
\]

with θ = E(Y) unknown. Let r and s denote respectively the smallest
and the largest of the observed values y_i. Let θ* = θ*(r,s) be any
function, defined for all r,s such that r ≤ s ≤ r + 1, which
satisfies s − 1/2 ≤ θ*(r,s) ≤ r + 1/2 and which is nondecreasing in r
and in s. Then θ*(r,s) satisfies the conditions of Lemma 1 since,
for each θ_0, {x | θ* ≤ θ_0} and {x | θ* < θ_0} satisfy the (necessary
and) sufficient condition given by Pratt [24] for admissibility of
one-sided tests on θ. Venketeraman [25] has shown that such
estimators constitute an essentially complete class, and has given
minimal complete and minimal essentially complete classes of
estimators of θ.

For samples of size n = 2, each of the following estimators is
admissible and median-unbiased:

\[
\theta^*(x) = (r + s)/2, \quad \text{the usual mean-unbiased estimator;}
\]
\[
\theta'(x) =
\begin{cases}
s - \tfrac12, & \text{if } s > r + 1/\sqrt2, \\
r + (\sqrt2 - 1)/2, & \text{if } s \le r + 1/\sqrt2;
\end{cases}
\]
\[
\theta''(x) =
\begin{cases}
r + \tfrac12, & \text{if } r \le s - 1/\sqrt2, \\
s - (\sqrt2 - 1)/2, & \text{if } r > s - 1/\sqrt2.
\end{cases}
\]

Among median-unbiased admissible estimators, θ' is uniformly best
with respect to errors of under-estimation, and θ'' is uniformly
best with respect to errors of over-estimation. Analogous
confidence curve estimators are easily constructed.
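The median-unbiasedness of the three estimators for n = 2 can be checked informally by simulation. The sketch below is only an illustration: the function names, the seed, the true value θ = 1.0 and the number of trials are arbitrary choices, and the max/min forms are algebraic rewritings of the piecewise definitions (using 1/√2 − 1/2 = (√2 − 1)/2), not notation from the text.

```python
import math
import random

C = (math.sqrt(2.0) - 1.0) / 2.0   # equals 1/sqrt(2) - 1/2

def theta_star(r, s):
    """Usual mean-unbiased estimator (r + s)/2."""
    return (r + s) / 2.0

def theta_prime(r, s):
    """s - 1/2 when s > r + 1/sqrt(2), otherwise r + (sqrt(2) - 1)/2."""
    return max(s - 0.5, r + C)

def theta_double_prime(r, s):
    """r + 1/2 when r <= s - 1/sqrt(2), otherwise s - (sqrt(2) - 1)/2."""
    return min(r + 0.5, s - C)

def fraction_not_above(estimator, theta, trials, rng):
    """Monte Carlo estimate of Prob[estimator <= theta | theta] for n = 2."""
    hits = 0
    for _ in range(trials):
        y1 = theta - 0.5 + rng.random()
        y2 = theta - 0.5 + rng.random()
        r, s = min(y1, y2), max(y1, y2)
        if estimator(r, s) <= theta:
            hits += 1
    return hits / trials

rng = random.Random(12345)            # fixed seed so the run is repeatable
fractions = {
    f.__name__: fraction_not_above(f, 1.0, 200_000, rng)
    for f in (theta_star, theta_prime, theta_double_prime)
}
```

Each entry of `fractions` should be close to 1/2, up to Monte Carlo error of roughly 0.001 at this number of trials.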

For any fixed k, 0 ≤ k ≤ 1/2, for testing hypotheses of the form
H(θ_0): θ ≤ θ_0 or H(θ_0−): θ < θ_0, there is an admissible acceptance
region

\[
A(\theta_0) = \bigl\{x \bigm| (r+s)/2 \le \theta_0 + k,\ s \le \theta_0 + \tfrac12\bigr\},
\]

and another admissible acceptance region

\[
A'(\theta_0) = \bigl\{x \bigm| (r+s)/2 \le \theta_0 - k,\ \text{or } r \le \theta_0 - \tfrac12\bigr\}.
\]

From such tests we obtain admissible confidence limit estimators at
each level, and the corresponding admissible confidence curve
estimator:


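\[
c(\theta,x) =
\begin{cases}
0, & \text{if } \theta \ge r + \tfrac12 \text{ or } \theta \le s - \tfrac12, \\
2\bigl[\tfrac12 - |\theta - (r+s)/2|\bigr]^2, & \text{otherwise.}
\end{cases}
\]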


If x = (0.9, 1.1) = (r,s), or alternatively if x = (0.6, 1.4) = (r,s),
we obtain respective confidence curve estimates which reflect that
the "amount of information in a sample" increases with (s−r):

[Figure: two panels, one for each sample, plotting c(θ,x) (vertical
axis, 0 to .5) against θ (horizontal axis, 0 to 1.5).]

Alternatively, for any fixed k, −1/2 ≤ k ≤ 1/2, there is for
each H(θ_0) and H(θ_0−) an admissible acceptance region

\[
A(\theta_0) = \bigl\{x \bigm| (\tfrac12 - k)\,r + (\tfrac12 + k)\,s \le \theta_0 + k\bigr\}.
\]

From such tests we obtain admissible confidence limit estimators
at each confidence level, and the corresponding admissible
confidence curve estimator:


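\[
c(\theta,x) =
\begin{cases}
0, & \text{if } \theta \ge r + \tfrac12 \text{ or } \theta \le s - \tfrac12, \\
\tfrac12 - |\theta - (r+s)/2|\big/(1 - s + r), & \text{otherwise.}
\end{cases}
\]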


For the two samples considered above, we obtain the respective
confidence curve estimates:

[Figure: two panels, one for each sample, plotting c(θ,x) (vertical
axis, 0 to .5) against θ (horizontal axis, 0 to 1.5).]

Since the last curve lies under that given by the first estimator
for the same sample, it provides stronger inferences about θ. This
is not inconsistent with the admissibility of the first estimator,
which provides (at most confidence levels) stronger inferences
(shorter confidence intervals) from relatively uninformative
samples like the first sample.

Example 8. Cauchy median. Let Y have the Cauchy density function

\[
h(y,\theta) = \frac{1}{\pi\bigl(1+(y-\theta)^2\bigr)}, \qquad -\infty < y < \infty,\ -\infty < \theta < \infty.
\]

Then

\[
S(y,\theta) = \frac{2(y-\theta)}{1+(y-\theta)^2}.
\]

Taking v(x,θ) = S(y,θ), the conditions of Corollary 1 are satisfied,
and v(x,θ) = 0 defines the median-unbiased locally-best estimator
θ*(y) = y. However for α ≠ 1/2, 0 < α < 1, the conditions of
Corollary 1 are not satisfied by v(x,θ) = S(y,θ) − G(α). For
x = (y_1,y_2), even for α = 1/2, v(x,θ) = S(x,θ) = S(y_1,θ) + S(y_2,θ)
fails to satisfy the conditions of Corollary 1. (For |y_1 − y_2|
large, S(x,θ) = 0 has three roots θ.) Thus in general there do not
exist confidence limit estimators (nor median-unbiased estimators)
which are locally-best uniformly in θ.
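The three-root phenomenon for two Cauchy observations can be made explicit. Writing a = y_1 − θ and b = y_2 − θ, the equation S(x,θ) = 0 factors as (a + b)(1 + ab) = 0, so the roots are θ = (y_1 + y_2)/2 together with θ = (y_1 + y_2)/2 ± ((y_1 − y_2)²/4 − 1)^{1/2} when |y_1 − y_2| > 2. This factorization and the threshold 2 are a short derivation added here for illustration; the function names below are hypothetical.

```python
import math

def cauchy_score(theta, ys):
    """S(x, theta) = sum of 2(y - theta) / (1 + (y - theta)^2) over the sample."""
    return sum(2.0 * (y - theta) / (1.0 + (y - theta) ** 2) for y in ys)

def score_roots_two_obs(y1, y2):
    """Roots of S(x, theta) = 0 for two observations.

    Uses the factorization (a + b)(1 + a*b) = 0 with a = y1 - theta, b = y2 - theta.
    """
    mid = 0.5 * (y1 + y2)
    disc = 0.25 * (y1 - y2) ** 2 - 1.0
    if disc <= 0.0:
        return [mid]                   # single root: the sample midpoint
    half = math.sqrt(disc)
    return [mid - half, mid, mid + half]

close_pair = score_roots_two_obs(0.0, 1.0)    # |y1 - y2| <= 2: one root
far_pair = score_roots_two_obs(-3.0, 3.0)     # |y1 - y2| > 2: three roots
```

With widely separated observations the score therefore vanishes at the midpoint and at two further points, one near each observation.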
10. Introduction to general theory of admissible estimators.

To illustrate the general theory of admissible estimators, and
the place of the methods introduced above within the general theory,
we consider the case in which Ω is finite: Ω = {θ | θ = 1,2,...,k}.
The principal features of the general case (in which Ω is any
subset of the real line) can be illustrated conveniently in this
case, for which the complete theory can be developed by relatively
elementary methods. For any such estimation problem, we have a
specified family of density functions f(x,θ), θ = 1,...,k. For
each estimator θ*(x), let

\[
b(u,\theta,\theta^*) =
\begin{cases}
\mathrm{Prob}[\theta^*(X) = u \mid \theta], & \text{if } u \ne \theta, \\
0, & \text{if } u = \theta,
\end{cases}
\]

for u,θ = 1,...,k. The risk curves of θ* are

\[
a(u,\theta,\theta^*) =
\begin{cases}
\sum_{j \le u} b(j,\theta,\theta^*), & \text{if } u < \theta, \\
0, & \text{if } u = \theta, \\
\sum_{j \ge u} b(j,\theta,\theta^*), & \text{if } u > \theta.
\end{cases}
\]
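A small numerical illustration of these definitions may help. The sketch below computes b(u,θ,θ*) and the risk curve a(u,θ,θ*) for one fixed θ from the distribution of the estimator under that θ; the function names and the probabilities in `dist` are hypothetical.

```python
def error_probs(dist, theta):
    """b(u, theta) for u = 1..k, where dist[u-1] = Prob[theta*(X) = u | theta]."""
    return [0.0 if u == theta else dist[u - 1] for u in range(1, len(dist) + 1)]

def risk_curve(dist, theta):
    """a(u, theta) for u = 1..k, built from the error probabilities b."""
    k = len(dist)
    b = error_probs(dist, theta)
    a = []
    for u in range(1, k + 1):
        if u < theta:
            a.append(sum(b[:u]))        # sum over j <= u: Prob[theta* <= u | theta]
        elif u > theta:
            a.append(sum(b[u - 1:]))    # sum over j >= u: Prob[theta* >= u | theta]
        else:
            a.append(0.0)
    return a

dist = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical distribution of theta* when theta = 3
a = risk_curve(dist, 3)            # approximately [0.1, 0.3, 0.0, 0.3, 0.1]
```

The risk curve is thus the lower tail of the estimator's distribution to the left of the true value and the upper tail to its right.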


It is useful to interpret such an estimation problem in
relation to a somewhat different statistical inference or decision
problem, which for brevity we shall call the multidecision problem:
This other problem is that of choosing, on the basis of an observed
value x, one of k specified simple hypotheses; it may also be
described as an estimation problem which lacks a parametric
structure in the sense that no ordering of the labels θ of the k
hypotheses is relevant to the problem. Any measurable function
θ*(x) taking only the values 1,...,k represents both a possible
solution to the multidecision problem (a decision function, or an
inference function, or an "estimator" in the last-mentioned sense)
and an estimator in the sense discussed above.

For the multidecision problem, the merits of each decision
function θ*(x) are represented completely by its error-probabilities
b(j,θ,θ*); for each θ, such probabilities are the components of the
vector-valued risk function of θ* at θ. The general goal is to
determine decision functions θ* for which these error-probabilities
are minimized jointly in some suitable sense. A decision function
θ* is called admissible if there is no other for which all
corresponding error-probabilities are at least as small, with at
least one strictly smaller. Complete classes, minimal essentially
complete classes, etc., are defined correspondingly (cf. Lindley [26]
and Wolfowitz [27]).

A simple necessary condition that θ*(x) be admissible for the
estimation problem is that it be admissible for the multidecision
problem. For if θ** is better than θ* for the latter problem,
b(j,θ,θ**) ≤ b(j,θ,θ*) for all (j,θ), with at least one inequality
strict; therefore a(u,θ,θ**) ≤ a(u,θ,θ*) for all (u,θ), with at
least one inequality strict. Thus the admissible estimators are a
subclass (typically a relatively small one) of the admissible
multidecision functions. Similarly every essentially complete class
of multidecision functions contains an essentially complete class
of estimators.

The relations between the estimation and multidecision problems
can be illustrated further in terms of techniques, related to
Bayes' formula, which play basic roles in the theory of each
problem: For any estimation problem specified as above, let
q = q(u,θ) be an arbitrary real-valued function such that
q(u,θ) ≥ 0 for u, θ = 1,...,k; any such function will be called a
weight function (for the estimation problem). For any such q and
any estimator θ*, we define the (generalized) Bayes risk:

\[
R(q,\theta^*) = \sum_{\theta=1}^{k} \sum_{u=1}^{k} q(u,\theta)\, a(u,\theta,\theta^*).
\]

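On the other hand, for any multidecision problem specified as
above, let Q = Q(u,θ) ≥ 0 be an arbitrary weight-function; then
for any multidecision function θ* the corresponding Bayes risk is:

\[
R'(Q,\theta^*) = \sum_{\theta=1}^{k} \sum_{u=1}^{k} Q(u,\theta)\, b(u,\theta,\theta^*).
\]

For any given θ* and q(u,θ), we have

\[
\begin{aligned}
R(q,\theta^*)
&= \sum_{\theta} \Bigl[\, \sum_{u>\theta} q(u,\theta) \sum_{j \ge u} b(j,\theta,\theta^*)
 + \sum_{u<\theta} q(u,\theta) \sum_{j \le u} b(j,\theta,\theta^*) \Bigr] \\
&= \sum_{\theta} \Bigl[\, \sum_{j>\theta}\ \sum_{j \ge u > \theta} q(u,\theta)\, b(j,\theta,\theta^*)
 + \sum_{j<\theta}\ \sum_{j \le u < \theta} q(u,\theta)\, b(j,\theta,\theta^*) \Bigr] \\
&= \sum_{\theta} \sum_{j} Q(j,\theta)\, b(j,\theta,\theta^*),
\end{aligned}
\]

where

\[
Q(j,\theta) =
\begin{cases}
\sum_{j \ge u > \theta} q(u,\theta), & \text{for } j > \theta, \\
0, & \text{for } j = \theta, \\
\sum_{j \le u < \theta} q(u,\theta), & \text{for } j < \theta.
\end{cases}
\]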
For each θ, Q(j,θ) is nondecreasing in j for j ≥ θ, and
nonincreasing in j for j ≤ θ; that is, Q(j,θ) has a single relative
minimum which it assumes on one or more consecutive values of j
including j = θ. Thus each weight-function q(u,θ) for the
estimation problem determines uniquely a weight-function Q(j,θ) for
the multidecision problem, which has, for each θ, a single relative
minimum. Conversely a weight-function Q(j,θ) for the multidecision
problem having, for each θ, a single relative minimum (in the
preceding sense) determines uniquely (through the last equation) a
weight-function q(u,θ) for the estimation problem. Thus the
Bayes solutions θ* for the estimation problem (i.e. the functions
θ* which, for some given q, minimize R(q,θ*)) are a subclass of the
Bayes solutions for the multidecision problems, characterized by
the preceding restriction on the possible forms of the weight
function Q(u,θ) for the latter problem.

For any given weight-function q, the determination of Bayes
estimators is conveniently carried out as follows: Let Q be
determined by q as above. Then R(q,θ*) = R'(Q,θ*) is minimized if,
for each x, θ*(x) takes the (a) value u which minimizes

\[
\sum_{\theta=1}^{k} Q(u,\theta)\, f(x,\theta).
\]

A simple sufficient condition for admissibility of an estimator is
that it be an essentially unique Bayes solution in the sense that
for some q it minimizes R(q,θ*), and every other estimator which
also minimizes R(q,θ*) has the same risk-curves a(u,θ). (A related
sufficient condition for admissibility is that an estimator be a
Bayes solution with respect to each of the weight functions
q_1,...,q_{m−1}, and that among all such estimators it is an
essentially unique Bayes solution with respect to some q_m.)
Another simple sufficient condition for admissibility is that an
estimator be a Bayes solution with respect to some q which is
positive for all u,θ. Every admissible estimator is a Bayes
solution with respect to some q; and the class of Bayes solutions
with respect to weight-functions q is a complete class of estimators.
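The computational recipe just described (form Q from q, then for each x pick the label u minimizing the weighted sum of densities) can be sketched as follows. Everything concrete here is an assumption made for illustration: the function names, the choice k = 3, the weight function q equal to 1 for all u ≠ θ, and the binomial densities standing in for f(x,θ).

```python
from math import comb

def multidecision_weights(q, k):
    """Q[j][theta] from q[u][theta]; labels run 1..k and index 0 is unused."""
    Q = [[0.0] * (k + 1) for _ in range(k + 1)]
    for theta in range(1, k + 1):
        for j in range(1, k + 1):
            if j > theta:      # sum of q(u, theta) over theta < u <= j
                Q[j][theta] = sum(q[u][theta] for u in range(theta + 1, j + 1))
            elif j < theta:    # sum of q(u, theta) over j <= u < theta
                Q[j][theta] = sum(q[u][theta] for u in range(j, theta))
    return Q

def bayes_estimate(fx, Q, k):
    """Label u minimizing sum over theta of Q(u, theta) * f(x, theta)."""
    def cost(u):
        return sum(Q[u][theta] * fx[theta] for theta in range(1, k + 1))
    return min(range(1, k + 1), key=cost)

k = 3
# Hypothetical weight function: q(u, theta) = 1 whenever u != theta.
q = [[0.0 if u == t else 1.0 for t in range(k + 1)] for u in range(k + 1)]
Q = multidecision_weights(q, k)

# Hypothetical densities: x = number of successes in 2 trials, with
# success probability 0.2, 0.5, 0.8 under theta = 1, 2, 3 respectively.
p = [None, 0.2, 0.5, 0.8]
f = {x: [0.0] + [comb(2, x) * p[t] ** x * (1 - p[t]) ** (2 - x)
                 for t in range(1, k + 1)]
     for x in (0, 1, 2)}
estimates = [bayes_estimate(f[x], Q, k) for x in (0, 1, 2)]
```

With this q the derived weights are Q(j,θ) = |j − θ|, so the rule reduces to minimizing posterior-weighted absolute label error.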

Various specific formulations of the estimation problem can be
exhibited as special cases of the present formulation. For example
let W(j,θ) denote the loss function adopted in any decision-
theoretic formulation: the loss incurred, if θ is true and it is
inferred that θ = j, is equal to W(j,θ). Then use of any estimator
θ* leads, when θ is true, to the expected loss

\[
E[W(\theta^*(X),\theta) \mid \theta] = \sum_{j} b(j,\theta,\theta^*)\, W(j,\theta) = r(\theta,\theta^*),
\]

a real-valued risk function (of θ). To illustrate the frequently
adopted specification that losses are proportional to the squared
error of the estimate, we replace the convenient labels θ = 1,...,k
by the more general parameter values θ = θ_1, θ_2,...,θ_k, where
θ_i < θ_{i+1}, and write W(u,θ_i) = c(θ_i)(u − θ_i)², where u is any
value in the range of θ*. (The expected mean-squared error can
generally be reduced further by dropping the restriction that the
range of θ* be the range of θ; the conflict between these
considerations disappears in typical problems where the range of θ
is an interval.) For any a priori probabilities g_i = Prob(θ_i),
i = 1,...,k, any estimator θ* gives the Bayes risk

\[
\sum_{i=1}^{k} g_i\, r(\theta_i,\theta^*)
= \sum_{i=1}^{k} g_i\, c(\theta_i) \sum_{u} b(u,\theta_i,\theta^*)(u - \theta_i)^2
= R'(Q,\theta^*) = R(q,\theta^*),
\]

where Q = Q(u,θ_i) = g_i c(θ_i)(u − θ_i)²; q(u,θ_i) is determined by Q
as above. Numerous examples are treated (without restrictions on Ω)
in the texts and research literature of decision theory.

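A simple loss function for the estimation problem is one of
the form

\[
W(j,\theta) =
\begin{cases}
0, & \text{if } \theta_1(\theta) < j < \theta_2(\theta), \\
c_1(\theta), & \text{if } j \le \theta_1(\theta), \\
c_2(\theta), & \text{if } j \ge \theta_2(\theta),
\end{cases}
\]

where c_i(θ) ≥ 0, θ_1(θ) ≤ θ ≤ θ_2(θ), θ_1(θ) < θ_2(θ), for θ = 1,2,...,k.
This gives the risk function

\[
r(\theta,\theta^*) = c_1(\theta)\, a(\theta_1(\theta),\theta,\theta^*) + c_2(\theta)\, a(\theta_2(\theta),\theta,\theta^*).
\]

If a priori probabilities g(θ) are adopted, then the Bayes risk,
with the use of θ*, is

\[
\sum_{\theta} g(\theta)\bigl[c_1(\theta)\, a(\theta_1(\theta),\theta,\theta^*) + c_2(\theta)\, a(\theta_2(\theta),\theta,\theta^*)\bigr]
= R'(Q,\theta^*) = R(q,\theta^*),
\]

where

\[
q(u,\theta) =
\begin{cases}
g(\theta)c_1(\theta), & \text{if } u = \theta_1(\theta), \\
g(\theta)c_2(\theta), & \text{if } u = \theta_2(\theta), \\
0, & \text{otherwise,}
\end{cases}
\]

and Q(j,θ) is determined by q as above.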

The methods of Sections 6-9 above can be characterized in the
present terms as follows: Writing

\[
R(q,\theta^*) = \sum_{u} \Bigl[\, \sum_{\theta > u} q(u,\theta)\, a(u,\theta,\theta^*)
 + \sum_{\theta \le u} q(u+1,\theta)\, a(u+1,\theta,\theta^*) \Bigr],
\]

for each u the summand can be interpreted as a linear combination,
with coefficients q ≥ 0, of the various probabilities of errors of
Types I and II given by a test of the one-sided hypothesis H(u):
θ ≤ u, against H'(u): θ > u, where the test has the acceptance
region A(u) = {x | θ*(x) ≤ u}. In other words, each such term
(with index u) is the Bayes risk in a certain one-sided testing
problem; it is minimized by a suitable acceptance region A(u)
(determined by a technique equivalent to the Neyman-Pearson lemma);
such Bayes acceptance regions are admissible under mild conditions.
If the estimation problem has a suitably simple structure, and if
the weight-function q is a suitable one, then the acceptance
regions A(u) will constitute a nondecreasing sequence in u; in such
cases, the Bayes risk in the estimation problem can be minimized by
minimizing simultaneously each of the mentioned terms with
respective indices u = 1,...,k. The Bayes estimator obtained in such
cases is:

\[
\theta^*(x) =
\begin{cases}
u, & \text{if } x \in A(u) - A(u-1), \text{ for } u = 2,\ldots,k, \\
1, & \text{if } x \in A(1).
\end{cases}
\]

It is problems having this structure which are treated in Sections
6-9 above (without the restriction that Ω be finite). The method
of Section 8 is represented by the form assumed by R(q,θ*) for the
special case of a simple loss-function, defined as above; in such
cases the minimization of a term of R with index u corresponds to
use of the Neyman-Pearson lemma to determine a best acceptance
region A(u) for testing between two simple hypotheses.

If Ω is not finite, after choosing any finite subset Ω' ⊂ Ω
(more or less "representative" of Ω) we can apply the above simple
computational methods to determine Bayes estimators of θ ∈ Ω'.
If for any q, the Bayes estimator θ* of θ ∈ Ω' is determined
essentially uniquely on the sample space (up to sets having
probability 0 for all θ ∈ Ω), then θ* is an admissible estimator
of θ ∈ Ω. In this way, elementary techniques can provide a
number of admissible estimators illustrative of the variety to be
found in the full admissible class.

11. An application of the general theory: estimators having
prescribed precision in a specified region; sequential probability
ratio estimators.

It is sometimes desired that an estimator have high precision
in some interval in the parameter space, while in the remainder of
the parameter space much lower precision would suffice. In general
efficient achievement of such a specification requires use of an
estimator based on a sequential sampling rule. One formulation
and solution of such a problem is the following; for illustrative
purposes, a concrete example is discussed.

Let Y_1, Y_2,... be independent Bernoulli trials, with
Prob(Y_i = 1) = θ, Prob(Y_i = 0) = (1 − θ). An estimator θ* is
required which will have high precision for θ near .5. This
requirement may be formulated in part as follows: For θ = .4 or
.6, the probability is at least .95 that θ* will be closer to the
correct one of these two values; in terms of risk curves of
estimators, we require essentially that a(.5, .4, θ*) ≤ .05 and
a(.5, .6, θ*) ≤ .05. (Further interpretations of these requirements
in relation to the general notion of precision will appear below.)
To meet these requirements, consider any estimator θ*, and consider
the test of the one-sided hypothesis H: θ ≤ .5 against H': θ > .5
given by the acceptance region {x | θ*(x) ≤ .5}. (The description
of the sample space on which our estimators are defined remains to
be specified.) The requirements to be met by θ* imply that this
test has error-probabilities not exceeding .05 when θ = .4 and
θ = .6. If sequential sampling rules are allowed, it is known that
the last condition is satisfied most efficiently, in terms of
expected number of observations Y_i required under θ = .4 and
θ = .6, by Wald's sequential probability ratio test [28]. (We
discuss such tests ignoring "excess at termination"; in problems of
the type being considered, this entails that some of the following
equations represent close approximations; for certain problems, no
such qualification is necessary.) The indicated sampling rule is:

Observe Y_1, Y_2,..., compute after each observation Y_m the sum
d_m = Y_1 + ... + Y_m and h = h(m,d_m) = 2d_m − m, and terminate
observation as soon as either h = k = log(19)/log(3/2) or h = −k.
The resulting sample space is

S = {x | x = (y_1,...,y_n), n = 1,2,...; |h(m,d_m)| < k or = k
     as m < n or m = n}.

The conditions specified above are met (with minimum expected
sample sizes under all values of θ) by use of this sampling rule and
any definition of θ*(x) which satisfies:

θ*(x) ≤ .5 for x such that h = −k,
θ*(x) > .5 for x such that h = k.
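The sampling rule can be sketched in a few lines. In this illustration the walk h = 2d_m − m moves in integer steps, so sampling stops the first time |h| ≥ k, that is at |h| = 8; this is the "excess at termination" that the text ignores, and handling it this way is an assumption of the sketch. The function names and the seed are hypothetical, and the closed-form exit probability is the standard gambler's-ruin formula for a simple random walk between symmetric barriers.

```python
import math
import random

K = math.log(19.0) / math.log(1.5)   # boundary k = log(19)/log(3/2), about 7.26

def sample_until_boundary(theta, rng, k=K):
    """Run the sampling rule once; return (n, h) at termination."""
    h = 0
    n = 0
    while abs(h) < k:
        y = 1 if rng.random() < theta else 0
        h += 2 * y - 1               # h = 2*d_m - m changes by +1 or -1
        n += 1
    return n, h

def prob_upper_exit(theta, k=K):
    """Probability of stopping at the upper boundary (gambler's ruin formula)."""
    b = math.ceil(k)                 # integer walk: effective barriers at +b and -b
    if theta == 0.5:
        return 0.5
    ratio = (1.0 - theta) / theta
    return 1.0 / (1.0 + ratio ** b)

rng = random.Random(2024)            # fixed seed so the run is repeatable
runs = [sample_until_boundary(0.4, rng) for _ in range(20_000)]
upper_fraction = sum(1 for n, h in runs if h > 0) / len(runs)
```

Under θ = .4 the probability of wrongly stopping at the upper boundary comes out somewhat below .05, because overshooting the boundary makes the test slightly conservative.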

The definition of θ*(x) can be completed so as to make it
admissible and median-unbiased. (Because S is discrete, use of an
auxiliary randomization variable is necessary to obtain exact
median-unbiasedness; we omit such randomization, obtaining an
admissible estimator which is approximately median-unbiased.)
Every estimator satisfying the preceding inequalities is a Bayes
solution for the above stated problem, given the sample space S.
The determination of an admissible estimator among these can be
interpreted as an illustration of the technique of using a sequence
of a priori distributions; and of choosing, among all Bayes
solutions for the first such distribution, one which minimizes the
Bayes risk for the second such distribution.
We have

\[
\begin{aligned}
S(x,\theta) &= d_n/\theta - (n-d_n)/(1-\theta) \\
&= d_n/\theta(1-\theta) - n/(1-\theta) \\
&=
\begin{cases}
n(\tfrac12 - \theta)/\theta(1-\theta) + k/2\theta(1-\theta), & \text{if } h = k, \\
n(\tfrac12 - \theta)/\theta(1-\theta) - k/2\theta(1-\theta), & \text{if } h = -k.
\end{cases}
\end{aligned}
\]

For any fixed θ_0 < 1/2, S(x,θ_0) is an increasing function of n as x
varies subject to h = −k; and the set of such points has probability
exceeding 1/2 when θ = θ_0. To determine a test of H(θ_0): θ ≤ θ_0
against H'(θ_0): θ > θ_0, with acceptance region {x | θ*(x) ≤ θ_0},
having size 1/2, and having the property that it is a locally-best
test of this form subject to the conditions already imposed upon
θ*(x), it is necessary and sufficient that θ*(x) satisfy the
following additional condition: Let n(θ_0) be determined by

\[
\mathrm{Prob}\bigl(h = -k \text{ and } n \le n(\theta_0) \bigm| \theta_0\bigr) = \tfrac12.
\]

In general, this relationship can be satisfied only approximately,
but always closely except for θ_0 very near 0. Then

θ*(x) ≤ θ_0 for x such that h = −k and n ≤ n(θ_0),
θ*(x) > θ_0 otherwise.


As θ_0 increases from 0 to 1/2, n(θ_0) takes successively the values
k, k + 1, k + 2,....

Proceeding similarly for any fixed θ_0 > 1/2, we define n(θ_0)
similarly for such values, and obtain the conditions

θ*(x) ≤ θ_0 for x such that h = k and n ≥ n(θ_0),
θ*(x) > θ_0 otherwise (among x such that h = k).

It is clear that all of these conditions on θ*(x) can be met
simultaneously (allowing the approximations mentioned), and that
they provide a full definition of the estimator. Since this
definition depends on x only through n = n(x) and h = h(x) = ±k,
θ* depends on x only through t = t(x) = h/kn. The range of t is
±1, ±1/2, ±1/3,..., and θ* is an increasing function of t.

Let F(t,θ) = Prob{t(X) ≤ t | θ}; then the estimator
θ* = θ*(x,.5) is defined as the root θ of the equation

v(x,θ,.5) = F(t(x),θ) − .5 = 0.

More generally, for each α, 0 < α < 1, a confidence limit
estimator θ*(x,α) is defined as the root θ of
v(x,θ,α) = F(t(x),θ) − α = 0. (The admissibility of such estimators
can be shown as above.) The family of such estimators constitutes
an admissible confidence curve estimator.
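For the Bernoulli example, F(t,θ) can be computed by following the random walk h step by step, and the confidence limits then follow by bisection on θ. The sketch below makes several assumptions that are not in the text: the walk is stopped at the integer barriers ±8 (the smallest integer not less than k), t is taken as ±1/n, the distribution of n is truncated at `n_max` steps, and all function names are hypothetical. Bisection relies on F(t,θ) being decreasing in θ.

```python
def exit_distribution(theta, b=8, n_max=1000):
    """lower[n], upper[n] = Prob[stop at step n at -b, resp. +b | theta]."""
    probs = {0: 1.0}                  # distribution of h over paths not yet stopped
    lower = [0.0] * (n_max + 1)
    upper = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        nxt = {}
        for h, p in probs.items():
            for step, w in ((1, theta), (-1, 1.0 - theta)):
                h2 = h + step
                if h2 == b:
                    upper[n] += p * w
                elif h2 == -b:
                    lower[n] += p * w
                else:
                    nxt[h2] = nxt.get(h2, 0.0) + p * w
        probs = nxt
    return lower, upper

def F(sign, n0, theta, b=8, n_max=1000):
    """Prob[t(X) <= t | theta] for t = sign / n0, with sign = -1 or +1."""
    lower, upper = exit_distribution(theta, b, n_max)
    if sign < 0:
        return sum(lower[: n0 + 1])          # lower exit with n <= n0
    return sum(lower) + sum(upper[n0:])      # any lower exit, or upper exit with n >= n0

def confidence_limit(sign, n0, alpha, tol=1e-6):
    """Root theta of F(t, theta) - alpha = 0, found by bisection."""
    lo, hi = 1e-6, 1.0 - 1e-6
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(sign, n0, mid) > alpha:
            lo = mid                         # F too large: root lies at larger theta
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical outcome: sampling stopped at the lower boundary after n = 20 trials.
estimate = confidence_limit(-1, 20, 0.5)     # approximately median-unbiased estimate
lower_95 = confidence_limit(-1, 20, 0.95)    # lower end of the 90 percent interval
upper_05 = confidence_limit(-1, 20, 0.05)    # upper end of the 90 percent interval
```

For an outcome at the lower boundary the upper limit stays below .6, in line with the requirement that the 90 percent interval never contain both .4 and .6.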

Confidence curve estimates of this kind will be narrow,
reflecting high precision, when n is very large, and will be wide,
reflecting low precision, when n is very small. It follows from
the requirements imposed upon θ*(x,.5) above that whenever
θ*(x,.5) > .5, we have θ*(x,.95) > .4 (whether n is small or large);
and that whenever θ*(x,.5) < .5, we have θ*(x,.05) < .6; hence the
90 percent confidence interval J(x) = [θ*(x,.95), θ*(x,.05)] will
never include both the values θ = .4 and θ = .6. (The event
n(x) = +∞, which has probability 0 under each θ, gives
J(x) = [.4, .6] and θ*(x,.5) = .5.) This constitutes a useful
interpretation of the formulation adopted above of the general
requirement of high precision for θ near .5.

For practical reasons, it is sometimes necessary to terminate
sampling before this is indicated by the above sampling rule, and
the question arises what inferences can be made validly on the
basis of such partial determination of an observation x.
Termination after m observations with |h(m,d_m)| < k is equivalent
to observation of the event −1/m < t(x) < 1/m. For each α, this
implies that the estimate θ*(x,α) (which would have been determined
by continuing sampling) satisfies θ̲*(x,α) < θ*(x,α) < θ̄*(x,α),
where θ̲*(x,α) and θ̄*(x,α) are respectively the roots θ of
F(−1/m,θ) = α and of F(1/m,θ) = α. These bounds on an estimate
narrow progressively with increasing m. When such bounds on an
estimate (or confidence curve) become sufficiently narrow for the
purpose at hand, sampling can be terminated without affecting the
validity of the (approximate) estimates obtained.

Concerning the computation of values of F(t,θ) required for
use of such estimators, the function F(0,θ) of θ is the operating
characteristic function of a sequential probability ratio test, on
which there is an extensive theoretical and quantitative literature
for a wide range of problems. For each θ, when F(0,θ) is known,
the determination of F(t,θ) is reduced to the problem of
determining the conditional cumulative distribution function of n
(the number of observations required for termination of sampling,
or the duration of a random walk with two absorbing barriers) on
the condition of termination with h = −k ("acceptance of H: θ ≤ .5",
or absorption at the lower boundary), and again on the condition
of termination with h = k ("rejection of H", or absorption at the
upper boundary). (The unconditional distribution of n, together
with one of these conditional distributions and F(0,θ), determines
the other conditional distribution.)

SCHEMATIC ILLUSTRATIONS OF CONFIDENCE CURVE ESTIMATES
OF A BINOMIAL PARAMETER θ HAVING HIGH PRECISION FOR θ NEAR 1/2

(A) n(x) very small, h(x) = −k
(B) n(x) very large, h(x) = −k
(C) Bounds on estimate; sampling curtailed with m very large.

[Figure: each panel plots c(θ,x) (vertical axis, 0 to .5) against θ
(horizontal axis, 0 to 1.0).]